Description

"Having lots of data is useless unless you know how to make sense of it and how to apply what you learn."
- Stephen Few

This course explores the internal design and construction of data management systems such as RDBMSes, Document Databases, and Consistent Data Stores. Students will be exposed to topics ranging from the mathematical foundations of database theory through the implementation of those concepts in a practical setting.

By the end of the course students are expected to understand...
* ... the relational model, relational algebra, and the use of equivalencies to create a search space of potential query plans.
* ... the design and application of organizational data structures, including indexes and paged files.
* ... the design and application of IO-aware algorithms and data structures, including out-of-core and distributed data management techniques.
* ... techniques for data management in parallel and distributed settings.

General Information

Lecture Location
Knox 20; Monday/Wednesday 5:00-6:20

Academic Integrity Policy
You are expected to submit only work that you have performed yourself (or with your group for group projects). 

You may **discuss** approaches, strategies, and techniques with classmates. Describing the general strategy that you or your group took to solve a problem is ok. Websites like Wikipedia, which discuss only concepts, are ok.

You may **not** share solutions or code. Letting someone "look at your code" is not ok. Websites like StackExchange, where code is shared, are not ok.

Violators will fail the class.

If a violation occurs on a group project, the entire group will be penalized. If someone else submits your work as their own, you will be penalized as well.

Announcements

Checkpoint #4 is now open
5/17/14 6:05 PM

Dear All,

It is now possible to submit your projects for checkpoint #4. Your code will be tested against six different queries, using the following command lines:

java -cp build/:lib/jsqlparser.jar:lib/junit-4.11.jar:lib/jdbm2.4.jar edu.buffalo.cse562.Main --build --data /Users/grader/Documents/autograder/engine/queries/chkpt4/b --swap /tmp/swap --index /tmp/idx /Users/grader/Documents/autograder/engine/queries/chkpt4/b/tpch_schemas.sql

java -cp build/:lib/jsqlparser.jar:lib/junit-4.11.jar:lib/jdbm2.4.jar edu.buffalo.cse562.Main --data /Users/grader/Documents/autograder/engine/queries/chkpt4/b --swap /tmp/swap --index /tmp/idx /Users/grader/Documents/autograder/engine/queries/chkpt4/b/tpch_schemas.sql /Users/grader/Documents/autograder/engine/queries/chkpt4/b/query01.sql
java -cp build/:lib/jsqlparser.jar:lib/junit-4.11.jar:lib/jdbm2.4.jar edu.buffalo.cse562.Main --data /Users/grader/Documents/autograder/engine/queries/chkpt4/b --swap /tmp/swap --index /tmp/idx /Users/grader/Documents/autograder/engine/queries/chkpt4/b/tpch_schemas.sql /Users/grader/Documents/autograder/engine/queries/chkpt4/b/query02.sql
java -cp build/:lib/jsqlparser.jar:lib/junit-4.11.jar:lib/jdbm2.4.jar edu.buffalo.cse562.Main --data /Users/grader/Documents/autograder/engine/queries/chkpt4/b --swap /tmp/swap --index /tmp/idx /Users/grader/Documents/autograder/engine/queries/chkpt4/b/tpch_schemas.sql /Users/grader/Documents/autograder/engine/queries/chkpt4/b/query03.sql
java -cp build/:lib/jsqlparser.jar:lib/junit-4.11.jar:lib/jdbm2.4.jar edu.buffalo.cse562.Main --data /Users/grader/Documents/autograder/engine/queries/chkpt4/b --swap /tmp/swap --index /tmp/idx /Users/grader/Documents/autograder/engine/queries/chkpt4/b/tpch_schemas.sql /Users/grader/Documents/autograder/engine/queries/chkpt4/b/query04.sql
java -cp build/:lib/jsqlparser.jar:lib/junit-4.11.jar:lib/jdbm2.4.jar edu.buffalo.cse562.Main --data /Users/grader/Documents/autograder/engine/queries/chkpt4/b --swap /tmp/swap --index /tmp/idx /Users/grader/Documents/autograder/engine/queries/chkpt4/b/tpch_schemas.sql /Users/grader/Documents/autograder/engine/queries/chkpt4/b/query05.sql
java -cp build/:lib/jsqlparser.jar:lib/junit-4.11.jar:lib/jdbm2.4.jar edu.buffalo.cse562.Main --data /Users/grader/Documents/autograder/engine/queries/chkpt4/b --swap /tmp/swap --index /tmp/idx /Users/grader/Documents/autograder/engine/queries/chkpt4/b/tpch_schemas.sql /Users/grader/Documents/autograder/engine/queries/chkpt4/b/query06.sql


The data set is about 50 MBytes, and each submission should take about 10-15 minutes. Please test your code offline before submitting. To do so, you can use a 25 MByte data set, available for download in the 'resources' section together with queries and expected answers (https://piazza.com/class_profile/get_resource/ho3bfjiiv3b1by/p18o636sgv118j8651jhkb6qmt8f).
Be aware that the order of execution is important (i.e., first query01, then query02, and so forth), since each query updates the state of the database.
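
For offline testing, the command lines above can be generated and run in the required order with a small helper. The following is only a sketch, not part of the grading setup: the function names `build_commands` and `run_all` are hypothetical, and it assumes that your local data directory contains `tpch_schemas.sql` and the query files named `query01.sql` through `query06.sql`, mirroring the grader's layout.

```python
import subprocess

# Classpath and main class copied from the command lines in the announcement.
CLASSPATH = "build/:lib/jsqlparser.jar:lib/junit-4.11.jar:lib/jdbm2.4.jar"
MAIN_CLASS = "edu.buffalo.cse562.Main"


def build_commands(data_dir, swap_dir="/tmp/swap", index_dir="/tmp/idx", n_queries=6):
    """Return the --build command followed by one command per query, in order.

    Assumption: the schema and query files live directly inside data_dir.
    """
    data_dir = data_dir.rstrip("/")
    schema = f"{data_dir}/tpch_schemas.sql"
    base = ["java", "-cp", CLASSPATH, MAIN_CLASS]
    common = ["--data", data_dir, "--swap", swap_dir, "--index", index_dir, schema]

    # The build step runs first, with the schema file only.
    commands = [base + ["--build"] + common]
    # Queries must run as query01, query02, ... because each one
    # updates the state of the database.
    for i in range(1, n_queries + 1):
        commands.append(base + common + [f"{data_dir}/query{i:02d}.sql"])
    return commands


def run_all(data_dir, **kwargs):
    """Run every command in sequence, stopping at the first failure."""
    for cmd in build_commands(data_dir, **kwargs):
        subprocess.run(cmd, check=True)
```

Calling `run_all("path/to/local/data")` from the project root would run the build step and then the six queries in sequence; comparing the output against the expected answers is left to you.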


The reference implementation performance is as follows:

Reference implementation execution time (query01): 22 seconds.
Reference implementation execution time (query02): 4.5 seconds.
Reference implementation execution time (query03): 4.5 seconds.
Reference implementation execution time (query04): 11.5 seconds.
Reference implementation execution time (query05): 26 seconds.
Reference implementation execution time (query06): 13.1 seconds.

The first four queries are worth 5 points each, the last two 6 points each, for a total of 32 points.

Please contact me if you need additional information.

Checkpoint #4 Sanity Check
5/15/14 12:32 PM

A sanity check is now available for Checkpoint #4. Please submit your projects to the grading machine. Each submission should take about 4 minutes.

A copy of both data and queries is available in the 'resources' section, for offline testing.

The teaching staff has posted a new projects resource.

Title: Checkpoint4_sanityCheck.zip


You can view it on the course page: https://piazza.com/buffalo/spring2014/cse562/resources

Checkpoint4_insert.zip has been added to class homepage under Resources
5/15/14 10:58 AM

Dear All,

Please find attached some test queries for Checkpoint #4, using "INSERT INTO" statements.

More queries (having "UPDATE" and "DELETE" statements) will follow.

Thanks.

The teaching staff has posted a new projects resource.

Title: Checkpoint4_insert.zip
http://www.piazza.com/class_profile/get_resource/ho3bfjiiv3b1by/hv86ed66i1d6d2
You can view it on the course page: https://piazza.com/buffalo/spring2014/cse562/resources

Final Exam Raw score
5/13/14 10:34 PM

Hi,

The final exam raw scores have been uploaded to UbLearns. The final grades will be curved. The curved grades will most likely be uploaded by the end of this week.

Enjoy your holidays,

CSE 562 Teaching Staff.

Exam is today!
5/12/14 3:29 PM

Students,

It has recently come to our attention that some students believe the exam is on Thursday.

The exam is today at 7:15 PM in Knox 20.

Please check HUB to see your exam schedule.

Grades for Assignment 7
5/12/14 12:52 PM

Assignment 7 grades are posted. You can collect your assignment during the special office hour today or later from Dr. Kennedy.

Special Office Hour
5/12/14 11:16 AM

You can meet the TAs from 2 PM to 3 PM today near the TA office.

Assignment 7
5/08/14 10:01 PM

Dear Student,

The solution for Assignment 7 is posted. The focus is on the concepts and on conveying the procedure. Accordingly, we have solved one example using a Bloom filter and the other using a semi-join.

Regards,

G. Vishrawas

Staff Office Hours
Name | Office Hours
Oliver Kennedy | When? Where?
Niccolo' Meneghetti | When? Where?
Shounak Gore | When? Where?
Vishrawas Gopalakrishnan | When? Where?