Description

An introduction to data science: using tools and writing programs to acquire, clean, analyze, explore, and visualize data; making data-driven inferences and decisions; and communicating results effectively. Topics include data manipulation, data analysis with statistics and machine learning, data communication through information visualization, and working with big data using scalable techniques.

Four major goals:
- data wrangling: how to acquire, clean, reshape or sample data so that it’s ready for further processing?
- data exploration: how to analyze the signal in a large, noisy dataset?
- prediction: can inferences and decisions be made based on the available data?
- communication: how can findings be effectively communicated to others?
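
The four goals above can be sketched end to end on a toy dataset. This is a minimal illustration in plain Python, in the spirit of the required textbook, not course material: the data in `RAW`, the column names, and the helper `predict` are all made up for the example.

```python
import csv
import io
import statistics

# Hypothetical raw data: hours studied vs. exam score, with one missing score.
RAW = """hours,score
1,52
2,58
3,
4,71
5,75
"""

# Data wrangling: parse the CSV text and drop rows with a missing field.
rows = [r for r in csv.DictReader(io.StringIO(RAW)) if r["hours"] and r["score"]]
hours = [float(r["hours"]) for r in rows]
scores = [float(r["score"]) for r in rows]

# Data exploration: simple summary statistics.
mean_hours = statistics.mean(hours)
mean_score = statistics.mean(scores)

# Prediction: fit a least-squares line score = intercept + slope * hours.
slope = (
    sum((h - mean_hours) * (s - mean_score) for h, s in zip(hours, scores))
    / sum((h - mean_hours) ** 2 for h in hours)
)
intercept = mean_score - slope * mean_hours


def predict(h):
    """Predicted score for h hours of study under the fitted line."""
    return intercept + slope * h


# Communication: report the findings in a readable sentence.
print(f"Mean score {mean_score:.1f}; each extra hour adds about {slope:.1f} points.")
```

A real project would typically use a library such as pandas for the wrangling step and a plot for the communication step; the standard-library version here only shows how the four stages fit together.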

General Information

Lectures
Wed 15:20-16:10 H313, Thu 11:20-13:10 NB06
Textbook
(Required) Data Science from Scratch: First Principles with Python, Joel Grus, O’Reilly, ISBN: 149190142X
http://shop.oreilly.com/product/0636920033400.do
Code: https://github.com/joelgrus/data-science-from-scratch

(Optional) Python Data Science Handbook, Jake VanderPlas, O'Reilly, 2016
(free) https://jakevdp.github.io/PythonDataScienceHandbook/
Code: https://github.com/jakevdp/PythonDataScienceHandbook

Announcements

Project submissions
1/13/19 12:50 AM

Submit your project notebooks (.ipynb) to webonline by Tuesday at the latest.

Make sure you include plenty of comments, as well as the following:

- Team and project info

- Dataset (link to source, explanation of data)

- Step by step instructions for code

- Interpretation of graphical or tabular results (what you learned)

assignment, quiz, lab codes
Project ideas
11/21/18 3:36 PM

Please reply to this post with the following:

  • Team name or acronym
  • Team members (at most 2 students)
  • Title
  • Data (explain where you will get the data from, the source of it – whether downloaded, scraped from the web, or obtained via an API, the expected size of the data, and the data elements in the set)
  • Description (explain the aim of the project)
  • Goals (what data analysis results do you expect to achieve?)
  • Tools (libraries, etc.)
Datasets for Project
11/12/18 9:29 PM

Hello everyone,
I have listed some example datasets for the group project. You can choose from the following list:

1. Video Game Sales with Ratings
Link: https://www.kaggle.com/rush4ratio/video-game-sales-with-ratings

2. Students' Academic Performance Dataset
Link: https://www.kaggle.com/aljarah/xAPI-Edu-Data

3. Video Game Sales
Link: https://www.kaggle.com/gregorut/videogamesales

4. Super Heroes Dataset
Link: https://www.kaggle.com/claudiodavi/superhero-set#heroes_information.csv

5. Pokémon for Data Mining and Machine Learning
Link: https://www.kaggle.com/alopez247/pokemon

6. Game of Thrones
Link: https://www.kaggle.com/mylesoneill/game-of-thrones

7. Google Play Store Apps
Link: https://www.kaggle.com/lava18/google-play-store-apps

8. IMDb Top 50 Movies
Link: https://www.kaggle.com/aditya1303/imdb-top-50-movies

9. Chocolate Bar Ratings
Link: https://www.kaggle.com/rtatman/chocolate-bar-ratings

10. World Happiness Report
Link: https://www.kaggle.com/unsdsn/world-happiness

11. FIFA World Cup
Link: https://www.kaggle.com/abecklas/fifa-world-cup

12. Human Resources Data Set
Link: https://www.kaggle.com/rhuebner/human-resources-data-set

Note: If you find an appropriate dataset in the list above, please tell the instructor (Erdoğan) which dataset you will use before starting the project.

Regards and happy coding!

Akın.

Example: Shakespeare Word Count (on Google Colab)
11/11/18 1:28 PM

Akın pointed out https://colab.research.google.com

It is a great pleasure to write Python code online in Jupyter mode: you can edit and execute code, and share it with others.

I wrote my first ipynb on Colab, and here it is, from the last class: counting the words in Shakespeare's works.

Title: Example: Shakespeare Word Count

https://colab.research.google.com/drive/1wHqesI7j2R-F0W4vq4VPuW3jtAz-tBG5

If you have a Google account, go to your Drive, add a new file, and choose Colaboratory. That's it; you are ready to create your first ipynb. Happy coding.
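
A word count of this kind can be sketched with the standard library alone. This is not the code from the linked notebook: the sample sentence, the function name `count_words`, and the tokenizing rule (lowercase runs of letters and apostrophes) are assumptions chosen for illustration.

```python
import re
from collections import Counter


def count_words(text):
    """Return a Counter mapping each lowercased word to its frequency."""
    # Treat any run of letters or apostrophes as a word; punctuation is dropped.
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words)


# Hypothetical sample; in practice, read the full text from a file instead.
sample = "To be, or not to be, that is the question."
counts = count_words(sample)

# Show the three most frequent words with their counts.
print(counts.most_common(3))
```

To run it on a complete text, replace `sample` with the contents of a file, for example one read with `open(path, encoding="utf-8").read()`, where `path` points to a plain-text copy of the works.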

Lecture date: Nov 8, 2018

You can view it on the course page: https://piazza.com/cankaya.edu.tr/fall2018/ceng499/resources

Asg 3 - twitter_api.ipynb (Fall 2018) has been added to class homepage under Resources
11/08/18 4:40 PM

The teaching staff has posted a new homework resource.

Title: Asg 3 - twitter_api.ipynb (Fall 2018)
http://www.piazza.com/class_profile/get_resource/jmkg3poi53j3zm/jo8n1fp8qjh17h

Due date: Nov 15, 2018


Asg1 - Intro to Python for DS
10/03/18 3:13 PM

Welcome to the world of Python and data science.

Your first assignment is here.

Title: Asg1 - Intro to Python for DS
http://www.piazza.com/class_profile/get_resource/jmkg3poi53j3zm/jmt3ov4fmy86le

Due date: Oct 10, 2018


HW1 - Learn python
9/27/18 3:35 PM

Please complete the online course below to show that you know the basics of Python. You have two weeks.

https://www.datacamp.com/courses/intro-to-python-for-data-science

Staff Office Hours
Name    Office Hours
Erdoğan Doğdu
When?
Where?