Description

Seminar and Course Structure

Set of broad-ranging industry talks that provide perspectives on:
- Domain and business needs
- Infrastructure needs
- Analytics needs
- State of the art

Basic:
- 2 units: Class participation + 2 team reports on seminar themes (due 3/14 and 5/2)

Advanced (subject to review of CVs and approval):
- 2 or 3 units: Potential projects, ideas, mentoring (including industry executives and VCs), and possible data (reports due 2/8, 3/7, 4/11, and 5/2)
Either i) Startup product development
OR
ii) Industry problem solution

Themes:
- Enterprise Analytics:
Enterprise databases (DB) and Business Intelligence (BI)
- Web Analytics
Leading to Hadoop, Spark/Shark, Streaming + Analytics
- Internet of Things
Continuous sensing and proactive response
What is new and different about it?

Data Science & Analytics:
Components
Data collection, storage, and basic processing
Architecture and Infrastructure
Analytics
Domain
Business Needs
To solve real Big Data problems, need expertise in some or all of these areas
Need to form teams!

Projects & Seminar Participation:
These are key to learning
Forming teams is critical
Need analytics, infrastructure, business, and domain expertise
We can help
Faculty
Staff
Other students
Industry executives, managers, researchers and personnel
VCs
Requires submission of CV, proposal

Expected Background

Basic
An Introduction to Statistical Learning
(James, Witten, Hastie, Tibshirani)
R or equivalent
Data Mining, linear algebra, statistics, or equivalent
Additional (specialized): Field Experiments (Gerber, Green)
Background courses are listed below

Advanced: To discuss
- Coursera courses, edX courses, campus courses

Possible background Courses:
Big Data Analytics Background Resources
http://www-bcf.usc.edu/~gareth/ISL/
https://work.caltech.edu/telecourse.html
http://www.stat.berkeley.edu/~mjwain/Fall2012_Stat241a/
http://datascienc.es/
http://courses.ischool.berkeley.edu/i290-dma/s12/
https://blogs.ischool.berkeley.edu/i290-abdt-s12/author/hearst/
http://www.cs.berkeley.edu/~jordan/courses/294-fall09/
http://alex.smola.org/teaching/berkeley2012/
http://www.cs.berkeley.edu/~jordan/courses/281A-spring14/

Action
Sign-up sheet
Set up teams
Provide CVs
Start determining data sets and projects
Meeting times, including Skype (beyond class times)
Set up boot camp times for Infrastructure and Machine Learning/Data Mining
Use Piazza!

Meeting Times:
Mon: 11 am-1 pm (backup: 4/5-5/6 pm), by appointment
Tue/Th: By appointment
Skype/tel, in addition to in-person meetings

Caveats on What the Course Is and Is Not:

This is about addressing the unstructured real world and Silicon Valley
NOT a structured course with an organized, linear flow
You are expected to already know or learn data mining and machine learning
Bootcamp for those who need assistance
Seminars to provide industry context
Again, thematic, but no evident linear flow structure – executive schedules!
Industry and VC mentors for
Entrepreneurial project on data analytics product development

Staff Office Hours

Ramakrishna Akella
Nikhil Mane