Dataset: City
The original data from Chicago’s data portal contains detailed information for each crime and call to 311. We have split the city up into regions using a simple grid and have aggregated this data by region.
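As a rough illustration of what aggregating by grid region can look like, the sketch below bins individual records into square cells and counts them per cell and complaint type. The grid origin, cell size, coordinates, and the `grid_cell` helper are all hypothetical; the source does not describe the actual grid that was used.

```python
from collections import Counter

def grid_cell(lat, lon, lat0, lon0, cell_size):
    """Map a coordinate to the (row, col) of the grid cell containing it."""
    row = int((lat - lat0) // cell_size)
    col = int((lon - lon0) // cell_size)
    return row, col

# Hypothetical records: (latitude, longitude, complaint type).
records = [
    (41.80, -87.60, "GRAFFITI"),
    (41.81, -87.61, "POT_HOLES"),
    (42.01, -87.70, "GRAFFITI"),
    (41.80, -87.60, "GRAFFITI"),
]

# Count records per (cell, complaint type). Origin and cell size are
# illustrative assumptions, not the values used to build the dataset.
counts = Counter()
for lat, lon, kind in records:
    cell = grid_cell(lat, lon, lat0=41.5, lon0=-88.0, cell_size=0.25)
    counts[(cell, kind)] += 1
```

Each key of `counts` pairs a region with a complaint type, which corresponds to one cell of the aggregated table: one row per region, one column per complaint type.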
Each city data file contains data for different types of complaints
(that is, calls to 311) and the total number of crimes on a per-region
basis. The first row in the file contains column labels, for example,
`GRAFFITI` or `POT_HOLES`. Subsequent rows contain data for
different regions of the city. A column contains data for a given
variable across all the rows. For example, the column with index 1 (the
second column) contains the number of calls about pot holes for each
region. In addition to information about specific types of
complaints, the file also has one column that contains the total
number of crimes in each region.
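The layout described above (a header row of labels, then one row per region) can be read with Python's standard `csv` module. The following is a minimal sketch that uses a small in-memory sample rather than the real file. The sample values, the column order, and the `TOTAL_PLACEHOLDER` label are assumptions made for illustration; in particular, `TOTAL_PLACEHOLDER` only stands in for the real crime-total column name, which is not shown here. The helpers `load_table` and `column` are likewise hypothetical.

```python
import csv
import io

# Hypothetical three-region sample in the layout described above.
# "TOTAL_PLACEHOLDER" stands in for the real crime-total column label.
SAMPLE = """GRAFFITI,POT_HOLES,TOTAL_PLACEHOLDER
12,30,101
5,8,40
20,14,75
"""

def load_table(f):
    """Return (labels, rows): the header strings and the data rows as floats."""
    reader = csv.reader(f)
    labels = next(reader)  # the first row holds the column labels
    rows = [[float(x) for x in row] for row in reader]
    return labels, rows

def column(rows, index):
    """Extract one variable (column) across all regions (rows)."""
    return [row[index] for row in rows]

labels, rows = load_table(io.StringIO(SAMPLE))
pot_holes = column(rows, 1)  # index 1 is the second column
```

To read an actual data file, the same function could be called on an open file handle, for example one obtained from `open("data/city/training.csv", newline="")`.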
| | |
|---|---|
| Dependent variable: | Total number of crimes in a region. The column name is |
| Predictor variables: | Complaint variables. The first 7 columns, with indices defined in |
| File paths: | data/city/training.csv, data/city/testing.csv |
| Task 1 expected output: | CITY Task 1a: CITY Task 1b: |
| Task 2 expected output: | CITY Task 2: |
| Task 3 expected output: | CITY Task 3a: |
| Task 4 expected output: | CITY Task 4: |
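To make the roles of the predictor and dependent variables concrete, here is a small sketch that separates each region's row into the 7 complaint counts and the crime total. It assumes the crime-total column sits at index 7, directly after the predictors; that position, the sample numbers, and the `split_rows` helper are illustrative assumptions, not details taken from the dataset.

```python
PREDICTOR_INDICES = list(range(7))  # the first 7 columns (complaint variables)
DEPENDENT_INDEX = 7                 # ASSUMPTION: crime total follows the predictors

def split_rows(rows, predictor_indices, dependent_index):
    """Split each row into predictor values (X) and the dependent value (y)."""
    X = [[row[i] for i in predictor_indices] for row in rows]
    y = [row[dependent_index] for row in rows]
    return X, y

# Hypothetical two-region example: 7 complaint counts plus a crime total.
rows = [
    [12, 30, 4, 7, 1, 0, 9, 101],
    [5, 8, 2, 3, 0, 1, 4, 40],
]
X, y = split_rows(rows, PREDICTOR_INDICES, DEPENDENT_INDEX)
```

Keeping the indices in named constants means the same function can be applied unchanged to both the training and the testing file.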