The pandas library

pandas is a data analysis toolkit for Python. It makes it easy to work with certain types of data, specially:

  • Tabular data (i.e., anything that can be expressed as labelled columns and rows)
  • Time series data
  • Matrix data
  • Statistical datasets

We will see some of pandas' features by working with a dataset of 311 service requests from New York City. This data is stored in a CSV file, which you can open with most spreadsheet software.

[Open the file in LibreOffice/Excel/etc.]

Each row in this file represents a service call (or "complaint") to 311 in New York City. The file has 51 columns, including columns such as:

  • Unique Key: A unique identifier for the service call
  • Created Date
  • Agency / Agency Name
  • Complaint Type / Descriptor
  • Incident Address

Etc.

Given this data, let's say we wanted to answer the following questions:

  1. What are the top 10 complaint types?
  2. How many noise complaints were there in a given borough?
  3. What is the top complaint in each borough?

With what we've seen so far this quarter, we already have the ability to answer these questions. We could load this data into a list of complaints, where each complaint is represented as tuple, dictionary, or object, and then use a variety of loops to process that data. However, pandas provides us with higher-level abstractions that will make it much easier to answer these questions. On top of that, pandas also handles many aspects of working with real-world data sets, like dealing with missing data or converting data to the right type (if we read the file manually, we would start with strings for everything, and would have to make sure everything was converted to the right type)

To import the pandas library we do this:

In [1]:
import pandas as pd

The most important data structure in pandas is the DataFrame, which represents tabular data, i.e., a table with rows and columns. For our purposes, we can assume that columns will always have labels (i.e., that a column always has a "name"). If we look at the 311 CSV file in the spreadsheet, this is the header of each column.

Rows can also have unique labels (this is called an "index"). This allows us to refer to individual rows by some unique identifier. If the dataset doesn't already include such an identifier, then pandas will simply label the rows from 0.

So, we are going to load the 311 CSV file using the read_csv function. It will automatically detect that the first row contains the column labels, but we need to explicitly tell read_csv that the row labels are contained in column Unique Key

In [131]:
complaints = pd.read_csv('311-service-requests.csv', index_col='Unique Key')

complaints is a DataFrame. You can think of it as containing a table representing the data in the file.

In [4]:
complaints
Out[4]:
Created Date Closed Date Agency Agency Name Complaint Type Descriptor Location Type Incident Zip Incident Address Street Name ... Bridge Highway Name Bridge Highway Direction Road Ramp Bridge Highway Segment Garage Lot Name Ferry Direction Ferry Terminal Name Latitude Longitude Location
Unique Key
26589651 10/31/2013 02:08:41 AM NaN NYPD New York City Police Department Noise - Street/Sidewalk Loud Talking Street/Sidewalk 11432 90-03 169 STREET 169 STREET ... NaN NaN NaN NaN NaN NaN NaN 40.708275 -73.791604 (40.70827532593202, -73.79160395779721)
26593698 10/31/2013 02:01:04 AM NaN NYPD New York City Police Department Illegal Parking Commercial Overnight Parking Street/Sidewalk 11378 58 AVENUE 58 AVENUE ... NaN NaN NaN NaN NaN NaN NaN 40.721041 -73.909453 (40.721040535628305, -73.90945306791765)
26594139 10/31/2013 02:00:24 AM 10/31/2013 02:40:32 AM NYPD New York City Police Department Noise - Commercial Loud Music/Party Club/Bar/Restaurant 10032 4060 BROADWAY BROADWAY ... NaN NaN NaN NaN NaN NaN NaN 40.843330 -73.939144 (40.84332975466513, -73.93914371913482)
26595721 10/31/2013 01:56:23 AM 10/31/2013 02:21:48 AM NYPD New York City Police Department Noise - Vehicle Car/Truck Horn Street/Sidewalk 10023 WEST 72 STREET WEST 72 STREET ... NaN NaN NaN NaN NaN NaN NaN 40.778009 -73.980213 (40.7780087446372, -73.98021349023975)
26590930 10/31/2013 01:53:44 AM NaN DOHMH Department of Health and Mental Hygiene Rodent Condition Attracting Rodents Vacant Lot 10027 WEST 124 STREET WEST 124 STREET ... NaN NaN NaN NaN NaN NaN NaN 40.807691 -73.947387 (40.80769092704951, -73.94738703491433)
26592370 10/31/2013 01:46:52 AM NaN NYPD New York City Police Department Noise - Commercial Banging/Pounding Club/Bar/Restaurant 11372 37 AVENUE 37 AVENUE ... NaN NaN NaN NaN NaN NaN NaN 40.749989 -73.881988 (40.7499893014072, -73.88198770727831)
26595682 10/31/2013 01:46:40 AM NaN NYPD New York City Police Department Blocked Driveway No Access Street/Sidewalk 11419 107-50 109 STREET 109 STREET ... NaN NaN NaN NaN NaN NaN NaN 40.681533 -73.831737 (40.68153278675525, -73.83173699701601)
26595195 10/31/2013 01:44:19 AM 10/31/2013 01:58:49 AM NYPD New York City Police Department Noise - Commercial Loud Music/Party Club/Bar/Restaurant 11417 137-09 CROSSBAY BOULEVARD CROSSBAY BOULEVARD ... NaN NaN NaN NaN NaN NaN NaN 40.671816 -73.843092 (40.67181584567338, -73.84309181950769)
26590540 10/31/2013 01:44:14 AM 10/31/2013 02:28:04 AM NYPD New York City Police Department Noise - Commercial Loud Talking Club/Bar/Restaurant 10011 258 WEST 15 STREET WEST 15 STREET ... NaN NaN NaN NaN NaN NaN NaN 40.739913 -74.000790 (40.73991339303542, -74.00079028612932)
26594392 10/31/2013 01:34:41 AM 10/31/2013 02:23:51 AM NYPD New York City Police Department Noise - Commercial Loud Music/Party Club/Bar/Restaurant 11225 835 NOSTRAND AVENUE NOSTRAND AVENUE ... NaN NaN NaN NaN NaN NaN NaN 40.668204 -73.950648 (40.66820406598287, -73.95064760056546)
26595176 10/31/2013 01:25:12 AM NaN NYPD New York City Police Department Noise - House of Worship Loud Music/Party House of Worship 11218 3775 18 AVENUE 18 AVENUE ... NaN NaN NaN NaN NaN NaN NaN 40.634378 -73.969462 (40.63437840816299, -73.96946177104543)
26591982 10/31/2013 01:24:14 AM 10/31/2013 01:54:39 AM NYPD New York City Police Department Noise - Commercial Loud Music/Party Club/Bar/Restaurant 10003 187 2 AVENUE 2 AVENUE ... NaN NaN NaN NaN NaN NaN NaN 40.730816 -73.986073 (40.73081644089586, -73.98607265739876)
26594169 10/31/2013 01:20:57 AM 10/31/2013 02:12:31 AM NYPD New York City Police Department Illegal Parking Double Parked Blocking Vehicle Street/Sidewalk 10029 65 EAST 99 STREET EAST 99 STREET ... NaN NaN NaN NaN NaN NaN NaN 40.788974 -73.952259 (40.78897400211689, -73.95225898702977)
26594391 10/31/2013 01:20:13 AM NaN NYPD New York City Police Department Noise - Vehicle Engine Idling Street/Sidewalk 10466 NaN NaN ... NaN NaN NaN NaN NaN NaN NaN 40.891517 -73.836457 (40.89151738488846, -73.83645714593568)
26590917 10/31/2013 01:19:54 AM NaN DOHMH Department of Health and Mental Hygiene Rodent Rat Sighting 1-2 Family Mixed Use Building 11219 63 STREET 63 STREET ... NaN NaN NaN NaN NaN NaN NaN 40.626477 -73.999218 (40.6264774690411, -73.99921826202639)
26591458 10/31/2013 01:14:02 AM 10/31/2013 01:30:34 AM NYPD New York City Police Department Noise - House of Worship Loud Music/Party House of Worship 10025 NaN NaN ... NaN NaN NaN NaN NaN NaN NaN 40.796597 -73.970370 (40.7965967075252, -73.97036973473399)
26594086 10/31/2013 12:54:03 AM 10/31/2013 02:16:39 AM NYPD New York City Police Department Noise - Street/Sidewalk Loud Music/Party Street/Sidewalk 10310 173 CAMPBELL AVENUE CAMPBELL AVENUE ... NaN NaN NaN NaN NaN NaN NaN 40.636182 -74.116150 (40.63618202176914, -74.1161500428337)
26595117 10/31/2013 12:52:46 AM NaN NYPD New York City Police Department Illegal Parking Posted Parking Sign Violation Street/Sidewalk 11236 NaN NaN ... NaN NaN NaN NaN NaN NaN NaN 40.632437 -73.888173 (40.63243692394328, -73.88817263437012)
26590389 10/31/2013 12:51:00 AM NaN DOT Department of Transportation Street Light Condition Street Light Out NaN NaN 226 42 ST E 42 ST E ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
26594210 10/31/2013 12:46:27 AM NaN NYPD New York City Police Department Noise - Commercial Loud Music/Party Club/Bar/Restaurant 10033 NaN NaN ... NaN NaN NaN NaN NaN NaN NaN 40.852058 -73.934776 (40.85205827756883, -73.93477640780834)
26592932 10/31/2013 12:43:47 AM 10/31/2013 12:56:20 AM NYPD New York City Police Department Noise - House of Worship Loud Music/Party House of Worship 11216 778 PARK PLACE PARK PLACE ... NaN NaN NaN NaN NaN NaN NaN 40.673505 -73.951844 (40.67350473678714, -73.95184414979961)
26594152 10/31/2013 12:41:17 AM 10/31/2013 01:04:37 AM NYPD New York City Police Department Noise - Commercial Banging/Pounding Store/Commercial 10016 155 E 34TH ST E 34TH ST ... NaN NaN NaN NaN NaN NaN NaN 40.746194 -73.978769 (40.74619417253121, -73.97876853124392)
26589678 10/31/2013 12:39:55 AM NaN NYPD New York City Police Department Noise - Vehicle Car/Truck Music Street/Sidewalk 11419 NaN NaN ... NaN NaN NaN NaN NaN NaN NaN 40.692394 -73.833891 (40.69239424979043, -73.8338912453996)
26592304 10/31/2013 12:38:00 AM NaN NYPD New York City Police Department Noise - Commercial Loud Music/Party Club/Bar/Restaurant 11216 371 TOMPKINS AVENUE TOMPKINS AVENUE ... NaN NaN NaN NaN NaN NaN NaN 40.684944 -73.944221 (40.6849442562592, -73.94422078036632)
26591892 10/31/2013 12:37:16 AM NaN NYPD New York City Police Department Blocked Driveway Partial Access Street/Sidewalk 10305 1496 BAY STREET BAY STREET ... NaN NaN NaN NaN NaN NaN NaN 40.607245 -74.061106 (40.60724493456944, -74.06110566015863)
26591573 10/31/2013 12:35:18 AM 10/31/2013 02:41:35 AM NYPD New York City Police Department Noise - Street/Sidewalk Loud Talking Street/Sidewalk 10312 24 PRINCETON LANE PRINCETON LANE ... NaN NaN NaN NaN NaN NaN NaN 40.553421 -74.196743 (40.55342078716953, -74.19674315017886)
26590509 10/31/2013 12:33:00 AM NaN DOT Department of Transportation Street Light Condition Street Light Out NaN NaN 38 ST E 38 ST E ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
26591379 10/31/2013 12:32:44 AM NaN DOHMH Department of Health and Mental Hygiene Harboring Bees/Wasps Bees/Wasps - Not a beekeper 3+ Family Mixed Use Building 10025 501 WEST 110 STREET WEST 110 STREET ... NaN NaN NaN NaN NaN NaN NaN 40.803149 -73.964266 (40.80314938553783, -73.96426608076082)
26594085 10/31/2013 12:32:08 AM NaN NYPD New York City Police Department Noise - Street/Sidewalk Loud Talking Street/Sidewalk 10026 121 WEST 116 STREET WEST 116 STREET ... NaN NaN NaN NaN NaN NaN NaN 40.802390 -73.950526 (40.80238950799943, -73.95052644123253)
26589201 10/31/2013 12:32:00 AM NaN DOT Department of Transportation Street Light Condition Street Light Out NaN 10309 295 BAYVIEW AVENUE BAYVIEW AVENUE ... NaN NaN NaN NaN NaN NaN NaN 40.517378 -74.203435 (40.517377871705676, -74.20343466779575)
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
26428764 10/04/2013 12:17:03 AM 10/04/2013 12:38:37 AM NYPD New York City Police Department Noise - Commercial Loud Music/Party Club/Bar/Restaurant 10022 249 EAST 53 STREET EAST 53 STREET ... NaN NaN NaN NaN NaN NaN NaN 40.757248 -73.968286 (40.757247604963055, -73.96828647941395)
26426166 10/04/2013 12:16:22 AM 10/04/2013 05:50:49 AM NYPD New York City Police Department Noise - Commercial Loud Music/Party Store/Commercial 10029 252 EAST 110 STREET EAST 110 STREET ... NaN NaN NaN NaN NaN NaN NaN 40.793635 -73.941649 (40.79363516179995, -73.94164859515777)
26438565 10/04/2013 12:16:00 AM NaN DEP Department of Environmental Protection Noise Noise: Construction Before/After Hours (NM1) NaN 11231 480 VAN BRUNT STREET VAN BRUNT STREET ... NaN NaN NaN NaN NaN NaN NaN 40.674249 -74.016558 (40.6742492231181, -74.01655803856313)
26428990 10/04/2013 12:15:52 AM 10/04/2013 12:44:52 AM NYPD New York City Police Department Noise - Street/Sidewalk Loud Talking Street/Sidewalk 10014 733 WASHINGTON STREET WASHINGTON STREET ... NaN NaN NaN NaN NaN NaN NaN 40.736304 -74.008299 (40.736303747410034, -74.00829935904578)
26432659 10/04/2013 12:15:46 AM 10/04/2013 04:18:45 AM NYPD New York City Police Department Noise - Commercial Loud Music/Party Club/Bar/Restaurant 11218 1213 CORTELYOU ROAD CORTELYOU ROAD ... NaN NaN NaN NaN NaN NaN NaN 40.640139 -73.966847 (40.64013899178716, -73.96684680242933)
26426096 10/04/2013 12:14:09 AM 10/04/2013 01:03:46 AM NYPD New York City Police Department Noise - Street/Sidewalk Loud Talking Street/Sidewalk 10032 539 WEST 162 STREET WEST 162 STREET ... NaN NaN NaN NaN NaN NaN NaN 40.836532 -73.941018 (40.836532082987944, -73.9410182995914)
26437764 10/04/2013 12:14:00 AM 10/04/2013 12:14:00 AM DEP Department of Environmental Protection Water System Dirty Water (WE) NaN 10022 251 EAST 51 STREET EAST 51 STREET ... NaN NaN NaN NaN NaN NaN NaN 40.755980 -73.969171 (40.75597977288785, -73.96917140638074)
26436286 10/04/2013 12:14:00 AM NaN DEP Department of Environmental Protection Noise Noise: Construction Before/After Hours (NM1) NaN 11231 480 VAN BRUNT STREET VAN BRUNT STREET ... NaN NaN NaN NaN NaN NaN NaN 40.674249 -74.016558 (40.6742492231181, -74.01655803856313)
26428989 10/04/2013 12:13:08 AM 10/04/2013 02:12:47 AM NYPD New York City Police Department Illegal Parking Posted Parking Sign Violation Street/Sidewalk 11434 NaN NaN ... NaN NaN NaN NaN NaN NaN NaN 40.656160 -73.767353 (40.656160351546845, -73.76735262738222)
26430030 10/04/2013 12:12:07 AM 10/04/2013 02:45:24 AM NYPD New York City Police Department Noise - Street/Sidewalk Loud Talking Street/Sidewalk 10027 215 WEST 131ST STREET WEST 131ST STREET ... NaN NaN NaN NaN NaN NaN NaN 40.813339 -73.946328 (40.81333907832113, -73.94632769228208)
26429663 10/04/2013 12:12:07 AM 10/04/2013 01:03:44 AM NYPD New York City Police Department Noise - Commercial Loud Music/Party Club/Bar/Restaurant 11209 8915 5 AVENUE 5 AVENUE ... NaN NaN NaN NaN NaN NaN NaN 40.619601 -74.027826 (40.619601147364584, -74.02782628492785)
26437763 10/04/2013 12:11:00 AM NaN DEP Department of Environmental Protection Noise Noise: Construction Before/After Hours (NM1) NaN 10028 NaN NaN ... NaN NaN NaN NaN NaN NaN NaN 40.774612 -73.948085 (40.77461174278734, -73.94808472177321)
26432955 10/04/2013 12:08:15 AM 10/04/2013 12:48:02 AM NYPD New York City Police Department Noise - Commercial Loud Music/Party Club/Bar/Restaurant 10009 506 EAST 13 STREET EAST 13 STREET ... NaN NaN NaN NaN NaN NaN NaN 40.729531 -73.980416 (40.72953107218499, -73.98041550317102)
26437035 10/04/2013 12:08:00 AM 10/04/2013 12:13:00 AM DEP Department of Environmental Protection Water System Dirty Water (WE) NaN 10022 325 EAST 54 STREET EAST 54 STREET ... NaN NaN NaN NaN NaN NaN NaN 40.757069 -73.965933 (40.75706852462872, -73.96593314322774)
26433197 10/04/2013 12:08:00 AM 10/04/2013 12:00:00 PM DSNY BCC - Queens East Derelict Vehicles 14 Derelict Vehicles Street 11413 220-11 145 AVENUE 145 AVENUE ... NaN NaN NaN NaN NaN NaN NaN 40.664353 -73.757556 (40.6643525308449, -73.75755575708348)
26426060 10/04/2013 12:06:39 AM 10/04/2013 12:31:16 AM NYPD New York City Police Department Noise - Street/Sidewalk Loud Talking Street/Sidewalk 11224 4823 BEACH 48 STREET BEACH 48 STREET ... NaN NaN NaN NaN NaN NaN NaN 40.577515 -74.012207 (40.57751513866536, -74.01220705163807)
26430628 10/04/2013 12:06:28 AM 10/04/2013 12:21:39 AM NYPD New York City Police Department Noise - Commercial Loud Talking Club/Bar/Restaurant 11209 7915 3 AVENUE 3 AVENUE ... NaN NaN NaN NaN NaN NaN NaN 40.628381 -74.029040 (40.62838143294541, -74.02904041333245)
26431648 10/04/2013 12:06:26 AM 10/23/2013 08:14:52 AM DOT Department of Transportation Street Sign - Missing Bus Stop Street 11378 NaN NaN ... NaN NaN NaN NaN NaN NaN NaN 40.732250 -73.924513 (40.73225000573841, -73.92451289173367)
26437034 10/04/2013 12:06:00 AM NaN DEP Department of Environmental Protection Noise Noise: Jack Hammering (NC2) NaN 10036 NaN NaN ... NaN NaN NaN NaN NaN NaN NaN 40.760116 -73.984836 (40.76011648520396, -73.98483562074706)
26426094 10/04/2013 12:05:12 AM 10/04/2013 01:08:29 AM NYPD New York City Police Department Noise - Commercial Loud Music/Party Club/Bar/Restaurant 11237 211 KNICKERBOCKER AVENUE KNICKERBOCKER AVENUE ... NaN NaN NaN NaN NaN NaN NaN 40.703365 -73.926345 (40.703365329011284, -73.92634531031759)
26429040 10/04/2013 12:04:52 AM 10/04/2013 03:01:04 AM NYPD New York City Police Department Noise - Street/Sidewalk Loud Talking Street/Sidewalk 10003 99 2 AVENUE 2 AVENUE ... NaN NaN NaN NaN NaN NaN NaN 40.727251 -73.988660 (40.727251287038676, -73.98866028603422)
26434084 10/04/2013 12:04:00 AM NaN DEP Department of Environmental Protection Noise Noise: Construction Before/After Hours (NM1) NaN 10036 NaN NaN ... NaN NaN NaN NaN NaN NaN NaN 40.760405 -73.987474 (40.76040500039782, -73.98747426235285)
26426164 10/04/2013 12:03:00 AM 10/04/2013 02:14:57 AM NYPD New York City Police Department Noise - Commercial Loud Music/Party Club/Bar/Restaurant 11106 30-09 BROADWAY BROADWAY ... NaN NaN NaN NaN NaN NaN NaN 40.762279 -73.926013 (40.762278541098084, -73.92601303458156)
26439710 10/04/2013 12:03:00 AM 10/04/2013 12:03:00 AM DEP Department of Environmental Protection Water System Dirty Water (WE) NaN 10022 325 EAST 54 STREET EAST 54 STREET ... NaN NaN NaN NaN NaN NaN NaN 40.757069 -73.965933 (40.75706852462872, -73.96593314322774)
26435569 10/04/2013 12:02:00 AM 10/04/2013 01:10:00 AM DEP Department of Environmental Protection Water System Dirty Water (WE) NaN 10022 311 EAST 50 STREET EAST 50 STREET ... NaN NaN NaN NaN NaN NaN NaN 40.754662 -73.967992 (40.75466197318078, -73.96799173864807)
26426013 10/04/2013 12:01:13 AM 10/07/2013 04:07:16 PM DPR Department of Parks and Recreation Maintenance or Facility Structure - Outdoors Park 11213 NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
26428083 10/04/2013 12:01:05 AM 10/04/2013 02:13:50 AM NYPD New York City Police Department Illegal Parking Posted Parking Sign Violation Street/Sidewalk 11434 NaN NaN ... NaN NaN NaN NaN NaN NaN NaN 40.656160 -73.767353 (40.656160351546845, -73.76735262738222)
26428987 10/04/2013 12:00:45 AM 10/04/2013 01:25:01 AM NYPD New York City Police Department Noise - Street/Sidewalk Loud Talking Street/Sidewalk 10016 344 EAST 28 STREET EAST 28 STREET ... NaN NaN NaN NaN NaN NaN NaN 40.740295 -73.976952 (40.740295354643706, -73.97695165980414)
26426115 10/04/2013 12:00:28 AM 10/04/2013 04:17:32 AM NYPD New York City Police Department Noise - Commercial Loud Talking Club/Bar/Restaurant 11226 1233 FLATBUSH AVENUE FLATBUSH AVENUE ... NaN NaN NaN NaN NaN NaN NaN 40.640182 -73.955306 (40.64018174662485, -73.95530566958138)
26428033 10/04/2013 12:00:10 AM 10/04/2013 01:20:52 AM NYPD New York City Police Department Blocked Driveway Partial Access Street/Sidewalk 11236 1259 EAST 94 STREET EAST 94 STREET ... NaN NaN NaN NaN NaN NaN NaN 40.640024 -73.900717 (40.640024057399216, -73.90071711703163)

111069 rows × 51 columns

A few basic operations on DataFrames:

In [5]:
complaints.shape
Out[5]:
(111069, 51)
In [88]:
complaints.columns
Out[88]:
Index(['Created Date', 'Closed Date', 'Agency', 'Agency Name',
       'Complaint Type', 'Descriptor', 'Location Type', 'Incident Zip',
       'Incident Address', 'Street Name', 'Cross Street 1', 'Cross Street 2',
       'Intersection Street 1', 'Intersection Street 2', 'Address Type',
       'City', 'Landmark', 'Facility Type', 'Status', 'Due Date',
       'Resolution Action Updated Date', 'Community Board', 'Borough',
       'X Coordinate (State Plane)', 'Y Coordinate (State Plane)',
       'Park Facility Name', 'Park Borough', 'School Name', 'School Number',
       'School Region', 'School Code', 'School Phone Number', 'School Address',
       'School City', 'School State', 'School Zip', 'School Not Found',
       'School or Citywide Complaint', 'Vehicle Type', 'Taxi Company Borough',
       'Taxi Pick Up Location', 'Bridge Highway Name',
       'Bridge Highway Direction', 'Road Ramp', 'Bridge Highway Segment',
       'Garage Lot Name', 'Ferry Direction', 'Ferry Terminal Name', 'Latitude',
       'Longitude', 'Location'],
      dtype='object')
In [7]:
complaints[:5]
Out[7]:
Created Date Closed Date Agency Agency Name Complaint Type Descriptor Location Type Incident Zip Incident Address Street Name ... Bridge Highway Name Bridge Highway Direction Road Ramp Bridge Highway Segment Garage Lot Name Ferry Direction Ferry Terminal Name Latitude Longitude Location
Unique Key
26589651 10/31/2013 02:08:41 AM NaN NYPD New York City Police Department Noise - Street/Sidewalk Loud Talking Street/Sidewalk 11432 90-03 169 STREET 169 STREET ... NaN NaN NaN NaN NaN NaN NaN 40.708275 -73.791604 (40.70827532593202, -73.79160395779721)
26593698 10/31/2013 02:01:04 AM NaN NYPD New York City Police Department Illegal Parking Commercial Overnight Parking Street/Sidewalk 11378 58 AVENUE 58 AVENUE ... NaN NaN NaN NaN NaN NaN NaN 40.721041 -73.909453 (40.721040535628305, -73.90945306791765)
26594139 10/31/2013 02:00:24 AM 10/31/2013 02:40:32 AM NYPD New York City Police Department Noise - Commercial Loud Music/Party Club/Bar/Restaurant 10032 4060 BROADWAY BROADWAY ... NaN NaN NaN NaN NaN NaN NaN 40.843330 -73.939144 (40.84332975466513, -73.93914371913482)
26595721 10/31/2013 01:56:23 AM 10/31/2013 02:21:48 AM NYPD New York City Police Department Noise - Vehicle Car/Truck Horn Street/Sidewalk 10023 WEST 72 STREET WEST 72 STREET ... NaN NaN NaN NaN NaN NaN NaN 40.778009 -73.980213 (40.7780087446372, -73.98021349023975)
26590930 10/31/2013 01:53:44 AM NaN DOHMH Department of Health and Mental Hygiene Rodent Condition Attracting Rodents Vacant Lot 10027 WEST 124 STREET WEST 124 STREET ... NaN NaN NaN NaN NaN NaN NaN 40.807691 -73.947387 (40.80769092704951, -73.94738703491433)

5 rows × 51 columns

We can access an individual column using square brackets. This returns a Series object, which represents a one-dimensional labelled array. In this case, the array is labelled by Unique Key.

In [8]:
complaints['Complaint Type']
Out[8]:
Unique Key
26589651     Noise - Street/Sidewalk
26593698             Illegal Parking
26594139          Noise - Commercial
26595721             Noise - Vehicle
26590930                      Rodent
26592370          Noise - Commercial
26595682            Blocked Driveway
26595195          Noise - Commercial
26590540          Noise - Commercial
26594392          Noise - Commercial
26595176    Noise - House of Worship
26591982          Noise - Commercial
26594169             Illegal Parking
26594391             Noise - Vehicle
26590917                      Rodent
26591458    Noise - House of Worship
26594086     Noise - Street/Sidewalk
26595117             Illegal Parking
26590389      Street Light Condition
26594210          Noise - Commercial
26592932    Noise - House of Worship
26594152          Noise - Commercial
26589678             Noise - Vehicle
26592304          Noise - Commercial
26591892            Blocked Driveway
26591573     Noise - Street/Sidewalk
26590509      Street Light Condition
26591379        Harboring Bees/Wasps
26594085     Noise - Street/Sidewalk
26589201      Street Light Condition
                      ...           
26428764          Noise - Commercial
26426166          Noise - Commercial
26438565                       Noise
26428990     Noise - Street/Sidewalk
26432659          Noise - Commercial
26426096     Noise - Street/Sidewalk
26437764                Water System
26436286                       Noise
26428989             Illegal Parking
26430030     Noise - Street/Sidewalk
26429663          Noise - Commercial
26437763                       Noise
26432955          Noise - Commercial
26437035                Water System
26433197           Derelict Vehicles
26426060     Noise - Street/Sidewalk
26430628          Noise - Commercial
26431648       Street Sign - Missing
26437034                       Noise
26426094          Noise - Commercial
26429040     Noise - Street/Sidewalk
26434084                       Noise
26426164          Noise - Commercial
26439710                Water System
26435569                Water System
26426013     Maintenance or Facility
26428083             Illegal Parking
26428987     Noise - Street/Sidewalk
26426115          Noise - Commercial
26428033            Blocked Driveway
Name: Complaint Type, dtype: object

However, if we select multiple columns, we get back a DataFrame:

In [9]:
complaints[['Complaint Type', 'Borough']]
Out[9]:
Complaint Type Borough
Unique Key
26589651 Noise - Street/Sidewalk QUEENS
26593698 Illegal Parking QUEENS
26594139 Noise - Commercial MANHATTAN
26595721 Noise - Vehicle MANHATTAN
26590930 Rodent MANHATTAN
26592370 Noise - Commercial QUEENS
26595682 Blocked Driveway QUEENS
26595195 Noise - Commercial QUEENS
26590540 Noise - Commercial MANHATTAN
26594392 Noise - Commercial BROOKLYN
26595176 Noise - House of Worship BROOKLYN
26591982 Noise - Commercial MANHATTAN
26594169 Illegal Parking MANHATTAN
26594391 Noise - Vehicle BRONX
26590917 Rodent BROOKLYN
26591458 Noise - House of Worship MANHATTAN
26594086 Noise - Street/Sidewalk STATEN ISLAND
26595117 Illegal Parking BROOKLYN
26590389 Street Light Condition BROOKLYN
26594210 Noise - Commercial MANHATTAN
26592932 Noise - House of Worship BROOKLYN
26594152 Noise - Commercial MANHATTAN
26589678 Noise - Vehicle QUEENS
26592304 Noise - Commercial BROOKLYN
26591892 Blocked Driveway STATEN ISLAND
26591573 Noise - Street/Sidewalk STATEN ISLAND
26590509 Street Light Condition BROOKLYN
26591379 Harboring Bees/Wasps MANHATTAN
26594085 Noise - Street/Sidewalk MANHATTAN
26589201 Street Light Condition STATEN ISLAND
... ... ...
26428764 Noise - Commercial MANHATTAN
26426166 Noise - Commercial MANHATTAN
26438565 Noise BROOKLYN
26428990 Noise - Street/Sidewalk MANHATTAN
26432659 Noise - Commercial BROOKLYN
26426096 Noise - Street/Sidewalk MANHATTAN
26437764 Water System MANHATTAN
26436286 Noise BROOKLYN
26428989 Illegal Parking QUEENS
26430030 Noise - Street/Sidewalk MANHATTAN
26429663 Noise - Commercial BROOKLYN
26437763 Noise MANHATTAN
26432955 Noise - Commercial MANHATTAN
26437035 Water System MANHATTAN
26433197 Derelict Vehicles QUEENS
26426060 Noise - Street/Sidewalk BROOKLYN
26430628 Noise - Commercial BROOKLYN
26431648 Street Sign - Missing QUEENS
26437034 Noise MANHATTAN
26426094 Noise - Commercial BROOKLYN
26429040 Noise - Street/Sidewalk MANHATTAN
26434084 Noise MANHATTAN
26426164 Noise - Commercial QUEENS
26439710 Water System MANHATTAN
26435569 Water System MANHATTAN
26426013 Maintenance or Facility BROOKLYN
26428083 Illegal Parking QUEENS
26428987 Noise - Street/Sidewalk MANHATTAN
26426115 Noise - Commercial BROOKLYN
26428033 Blocked Driveway BROOKLYN

111069 rows × 2 columns

We can access an individual row using its label:

In [10]:
complaints.loc[26589651]
Out[10]:
Created Date                                       10/31/2013 02:08:41 AM
Closed Date                                                           NaN
Agency                                                               NYPD
Agency Name                               New York City Police Department
Complaint Type                                    Noise - Street/Sidewalk
Descriptor                                                   Loud Talking
Location Type                                             Street/Sidewalk
Incident Zip                                                        11432
Incident Address                                         90-03 169 STREET
Street Name                                                    169 STREET
Cross Street 1                                                  90 AVENUE
Cross Street 2                                                  91 AVENUE
Intersection Street 1                                                 NaN
Intersection Street 2                                                 NaN
Address Type                                                      ADDRESS
City                                                              JAMAICA
Landmark                                                              NaN
Facility Type                                                    Precinct
Status                                                           Assigned
Due Date                                           10/31/2013 10:08:41 AM
Resolution Action Updated Date                     10/31/2013 02:35:17 AM
Community Board                                                 12 QUEENS
Borough                                                            QUEENS
X Coordinate (State Plane)                                    1.04203e+06
Y Coordinate (State Plane)                                         197389
Park Facility Name                                            Unspecified
Park Borough                                                       QUEENS
School Name                                                   Unspecified
School Number                                                 Unspecified
School Region                                                 Unspecified
School Code                                                   Unspecified
School Phone Number                                           Unspecified
School Address                                                Unspecified
School City                                                   Unspecified
School State                                                  Unspecified
School Zip                                                    Unspecified
School Not Found                                                        N
School or Citywide Complaint                                          NaN
Vehicle Type                                                          NaN
Taxi Company Borough                                                  NaN
Taxi Pick Up Location                                                 NaN
Bridge Highway Name                                                   NaN
Bridge Highway Direction                                              NaN
Road Ramp                                                             NaN
Bridge Highway Segment                                                NaN
Garage Lot Name                                                       NaN
Ferry Direction                                                       NaN
Ferry Terminal Name                                                   NaN
Latitude                                                          40.7083
Longitude                                                        -73.7916
Location                          (40.70827532593202, -73.79160395779721)
Name: 26589651, dtype: object

Or by its row number:

In [11]:
complaints.iloc[0]
Out[11]:
Created Date                                       10/31/2013 02:08:41 AM
Closed Date                                                           NaN
Agency                                                               NYPD
Agency Name                               New York City Police Department
Complaint Type                                    Noise - Street/Sidewalk
Descriptor                                                   Loud Talking
Location Type                                             Street/Sidewalk
Incident Zip                                                        11432
Incident Address                                         90-03 169 STREET
Street Name                                                    169 STREET
Cross Street 1                                                  90 AVENUE
Cross Street 2                                                  91 AVENUE
Intersection Street 1                                                 NaN
Intersection Street 2                                                 NaN
Address Type                                                      ADDRESS
City                                                              JAMAICA
Landmark                                                              NaN
Facility Type                                                    Precinct
Status                                                           Assigned
Due Date                                           10/31/2013 10:08:41 AM
Resolution Action Updated Date                     10/31/2013 02:35:17 AM
Community Board                                                 12 QUEENS
Borough                                                            QUEENS
X Coordinate (State Plane)                                    1.04203e+06
Y Coordinate (State Plane)                                         197389
Park Facility Name                                            Unspecified
Park Borough                                                       QUEENS
School Name                                                   Unspecified
School Number                                                 Unspecified
School Region                                                 Unspecified
School Code                                                   Unspecified
School Phone Number                                           Unspecified
School Address                                                Unspecified
School City                                                   Unspecified
School State                                                  Unspecified
School Zip                                                    Unspecified
School Not Found                                                        N
School or Citywide Complaint                                          NaN
Vehicle Type                                                          NaN
Taxi Company Borough                                                  NaN
Taxi Pick Up Location                                                 NaN
Bridge Highway Name                                                   NaN
Bridge Highway Direction                                              NaN
Road Ramp                                                             NaN
Bridge Highway Segment                                                NaN
Garage Lot Name                                                       NaN
Ferry Direction                                                       NaN
Ferry Terminal Name                                                   NaN
Latitude                                                          40.7083
Longitude                                                        -73.7916
Location                          (40.70827532593202, -73.79160395779721)
Name: 26589651, dtype: object

Quick aside: if we don't specify an index column, then row number becomes the index column.

In [12]:
complaints_no_index = pd.read_csv('311-service-requests.csv')
In [13]:
complaints_no_index[:5]
Out[13]:
Unique Key Created Date Closed Date Agency Agency Name Complaint Type Descriptor Location Type Incident Zip Incident Address ... Bridge Highway Name Bridge Highway Direction Road Ramp Bridge Highway Segment Garage Lot Name Ferry Direction Ferry Terminal Name Latitude Longitude Location
0 26589651 10/31/2013 02:08:41 AM NaN NYPD New York City Police Department Noise - Street/Sidewalk Loud Talking Street/Sidewalk 11432 90-03 169 STREET ... NaN NaN NaN NaN NaN NaN NaN 40.708275 -73.791604 (40.70827532593202, -73.79160395779721)
1 26593698 10/31/2013 02:01:04 AM NaN NYPD New York City Police Department Illegal Parking Commercial Overnight Parking Street/Sidewalk 11378 58 AVENUE ... NaN NaN NaN NaN NaN NaN NaN 40.721041 -73.909453 (40.721040535628305, -73.90945306791765)
2 26594139 10/31/2013 02:00:24 AM 10/31/2013 02:40:32 AM NYPD New York City Police Department Noise - Commercial Loud Music/Party Club/Bar/Restaurant 10032 4060 BROADWAY ... NaN NaN NaN NaN NaN NaN NaN 40.843330 -73.939144 (40.84332975466513, -73.93914371913482)
3 26595721 10/31/2013 01:56:23 AM 10/31/2013 02:21:48 AM NYPD New York City Police Department Noise - Vehicle Car/Truck Horn Street/Sidewalk 10023 WEST 72 STREET ... NaN NaN NaN NaN NaN NaN NaN 40.778009 -73.980213 (40.7780087446372, -73.98021349023975)
4 26590930 10/31/2013 01:53:44 AM NaN DOHMH Department of Health and Mental Hygiene Rodent Condition Attracting Rodents Vacant Lot 10027 WEST 124 STREET ... NaN NaN NaN NaN NaN NaN NaN 40.807691 -73.947387 (40.80769092704951, -73.94738703491433)

5 rows × 52 columns

In [14]:
complaints_no_index[['Complaint Type', 'Borough']]
Out[14]:
Complaint Type Borough
0 Noise - Street/Sidewalk QUEENS
1 Illegal Parking QUEENS
2 Noise - Commercial MANHATTAN
3 Noise - Vehicle MANHATTAN
4 Rodent MANHATTAN
5 Noise - Commercial QUEENS
6 Blocked Driveway QUEENS
7 Noise - Commercial QUEENS
8 Noise - Commercial MANHATTAN
9 Noise - Commercial BROOKLYN
10 Noise - House of Worship BROOKLYN
11 Noise - Commercial MANHATTAN
12 Illegal Parking MANHATTAN
13 Noise - Vehicle BRONX
14 Rodent BROOKLYN
15 Noise - House of Worship MANHATTAN
16 Noise - Street/Sidewalk STATEN ISLAND
17 Illegal Parking BROOKLYN
18 Street Light Condition BROOKLYN
19 Noise - Commercial MANHATTAN
20 Noise - House of Worship BROOKLYN
21 Noise - Commercial MANHATTAN
22 Noise - Vehicle QUEENS
23 Noise - Commercial BROOKLYN
24 Blocked Driveway STATEN ISLAND
25 Noise - Street/Sidewalk STATEN ISLAND
26 Street Light Condition BROOKLYN
27 Harboring Bees/Wasps MANHATTAN
28 Noise - Street/Sidewalk MANHATTAN
29 Street Light Condition STATEN ISLAND
... ... ...
111039 Noise - Commercial MANHATTAN
111040 Noise - Commercial MANHATTAN
111041 Noise BROOKLYN
111042 Noise - Street/Sidewalk MANHATTAN
111043 Noise - Commercial BROOKLYN
111044 Noise - Street/Sidewalk MANHATTAN
111045 Water System MANHATTAN
111046 Noise BROOKLYN
111047 Illegal Parking QUEENS
111048 Noise - Street/Sidewalk MANHATTAN
111049 Noise - Commercial BROOKLYN
111050 Noise MANHATTAN
111051 Noise - Commercial MANHATTAN
111052 Water System MANHATTAN
111053 Derelict Vehicles QUEENS
111054 Noise - Street/Sidewalk BROOKLYN
111055 Noise - Commercial BROOKLYN
111056 Street Sign - Missing QUEENS
111057 Noise MANHATTAN
111058 Noise - Commercial BROOKLYN
111059 Noise - Street/Sidewalk MANHATTAN
111060 Noise MANHATTAN
111061 Noise - Commercial QUEENS
111062 Water System MANHATTAN
111063 Water System MANHATTAN
111064 Maintenance or Facility BROOKLYN
111065 Illegal Parking QUEENS
111066 Noise - Street/Sidewalk MANHATTAN
111067 Noise - Commercial BROOKLYN
111068 Blocked Driveway BROOKLYN

111069 rows × 2 columns

In [15]:
complaints_no_index.loc[0]
Out[15]:
Unique Key                                                       26589651
Created Date                                       10/31/2013 02:08:41 AM
Closed Date                                                           NaN
Agency                                                               NYPD
Agency Name                               New York City Police Department
Complaint Type                                    Noise - Street/Sidewalk
Descriptor                                                   Loud Talking
Location Type                                             Street/Sidewalk
Incident Zip                                                        11432
Incident Address                                         90-03 169 STREET
Street Name                                                    169 STREET
Cross Street 1                                                  90 AVENUE
Cross Street 2                                                  91 AVENUE
Intersection Street 1                                                 NaN
Intersection Street 2                                                 NaN
Address Type                                                      ADDRESS
City                                                              JAMAICA
Landmark                                                              NaN
Facility Type                                                    Precinct
Status                                                           Assigned
Due Date                                           10/31/2013 10:08:41 AM
Resolution Action Updated Date                     10/31/2013 02:35:17 AM
Community Board                                                 12 QUEENS
Borough                                                            QUEENS
X Coordinate (State Plane)                                    1.04203e+06
Y Coordinate (State Plane)                                         197389
Park Facility Name                                            Unspecified
Park Borough                                                       QUEENS
School Name                                                   Unspecified
School Number                                                 Unspecified
School Region                                                 Unspecified
School Code                                                   Unspecified
School Phone Number                                           Unspecified
School Address                                                Unspecified
School City                                                   Unspecified
School State                                                  Unspecified
School Zip                                                    Unspecified
School Not Found                                                        N
School or Citywide Complaint                                          NaN
Vehicle Type                                                          NaN
Taxi Company Borough                                                  NaN
Taxi Pick Up Location                                                 NaN
Bridge Highway Name                                                   NaN
Bridge Highway Direction                                              NaN
Road Ramp                                                             NaN
Bridge Highway Segment                                                NaN
Garage Lot Name                                                       NaN
Ferry Direction                                                       NaN
Ferry Terminal Name                                                   NaN
Latitude                                                          40.7083
Longitude                                                        -73.7916
Location                          (40.70827532593202, -73.79160395779721)
Name: 0, dtype: object

Task 1: Top 10 complaint types

--> Basic operations on DataFrames: http://pandas.pydata.org/pandas-docs/stable/basics.html

In [25]:
complaints['Complaint Type']
Out[25]:
Unique Key
26589651     Noise - Street/Sidewalk
26593698             Illegal Parking
26594139          Noise - Commercial
26595721             Noise - Vehicle
26590930                      Rodent
26592370          Noise - Commercial
26595682            Blocked Driveway
26595195          Noise - Commercial
26590540          Noise - Commercial
26594392          Noise - Commercial
26595176    Noise - House of Worship
26591982          Noise - Commercial
26594169             Illegal Parking
26594391             Noise - Vehicle
26590917                      Rodent
26591458    Noise - House of Worship
26594086     Noise - Street/Sidewalk
26595117             Illegal Parking
26590389      Street Light Condition
26594210          Noise - Commercial
26592932    Noise - House of Worship
26594152          Noise - Commercial
26589678             Noise - Vehicle
26592304          Noise - Commercial
26591892            Blocked Driveway
26591573     Noise - Street/Sidewalk
26590509      Street Light Condition
26591379        Harboring Bees/Wasps
26594085     Noise - Street/Sidewalk
26589201      Street Light Condition
                      ...           
26428764          Noise - Commercial
26426166          Noise - Commercial
26438565                       Noise
26428990     Noise - Street/Sidewalk
26432659          Noise - Commercial
26426096     Noise - Street/Sidewalk
26437764                Water System
26436286                       Noise
26428989             Illegal Parking
26430030     Noise - Street/Sidewalk
26429663          Noise - Commercial
26437763                       Noise
26432955          Noise - Commercial
26437035                Water System
26433197           Derelict Vehicles
26426060     Noise - Street/Sidewalk
26430628          Noise - Commercial
26431648       Street Sign - Missing
26437034                       Noise
26426094          Noise - Commercial
26429040     Noise - Street/Sidewalk
26434084                       Noise
26426164          Noise - Commercial
26439710                Water System
26435569                Water System
26426013     Maintenance or Facility
26428083             Illegal Parking
26428987     Noise - Street/Sidewalk
26426115          Noise - Commercial
26428033            Blocked Driveway
Name: Complaint Type, dtype: object

The value_counts method will tally up the number of times a value appears in a column, and will return a Series with the counts, in descending order.

In [17]:
ctcounts = complaints['Complaint Type'].value_counts()
In [18]:
ctcounts
Out[18]:
HEATING                                 14200
GENERAL CONSTRUCTION                     7471
Street Light Condition                   7117
DOF Literature Request                   5797
PLUMBING                                 5373
PAINT - PLASTER                          5149
Blocked Driveway                         4590
NONCONST                                 3998
Street Condition                         3473
Illegal Parking                          3343
Noise                                    3321
Traffic Signal Condition                 3145
Dirty Conditions                         2653
Water System                             2636
Noise - Commercial                       2578
ELECTRIC                                 2350
Broken Muni Meter                        2070
Noise - Street/Sidewalk                  1928
Sanitation Condition                     1824
Rodent                                   1632
Sewer                                    1627
Consumer Complaint                       1227
Taxi Complaint                           1227
Damaged Tree                             1180
Overgrown Tree/Branches                  1083
Graffiti                                  973
Missed Collection (All Materials)         973
Building/Use                              942
Root/Sewer/Sidewalk Condition             836
Derelict Vehicle                          803
                                        ...  
Miscellaneous Categories                    5
Posting Advertisement                       5
Internal Code                               5
Poison Ivy                                  5
Fire Alarm - Modification                   5
Special Natural Area District (SNAD)        4
Transportation Provider Complaint           4
Illegal Animal Sold                         4
Ferry Complaint                             4
Invitation                                  3
Fire Alarm - Replacement                    3
Illegal Fireworks                           3
Adopt-A-Basket                              3
Window Guard                                2
Misc. Comments                              2
Opinion for the Mayor                       2
Public Assembly                             2
Legal Services Provider Complaint           2
DFTA Literature Request                     2
Highway Sign - Damaged                      1
Open Flame Permit                           1
Stalled Sites                               1
Ferry Permit                                1
Trans Fat                                   1
Municipal Parking Facility                  1
DWD                                         1
DHS Income Savings Requirement              1
Snow                                        1
X-Ray Machine/Equipment                     1
Tunnel Condition                            1
Name: Complaint Type, dtype: int64
In [19]:
ctcounts[:10]
Out[19]:
HEATING                   14200
GENERAL CONSTRUCTION       7471
Street Light Condition     7117
DOF Literature Request     5797
PLUMBING                   5373
PAINT - PLASTER            5149
Blocked Driveway           4590
NONCONST                   3998
Street Condition           3473
Illegal Parking            3343
Name: Complaint Type, dtype: int64
In [20]:
%matplotlib inline
import matplotlib.pyplot as plt
In [21]:
ctcounts[:10].plot("bar")
Out[21]:
<matplotlib.axes._subplots.AxesSubplot at 0x7f2968113b70>
In [26]:
complaints['Latitude'].plot('hist', bins=50),
Out[26]:
(<matplotlib.axes._subplots.AxesSubplot at 0x7f29636c6908>,)
In [27]:
complaints.plot(kind='scatter', x='Longitude', y='Latitude')
Out[27]:
<matplotlib.axes._subplots.AxesSubplot at 0x7f2963033e80>

Task 2: Count noise complaints in a given borough

This is similar to fancy indexing in NumPy. We can apply operations to columns, including comparison operators:

In [28]:
is_noise = complaints['Complaint Type'] == "Noise - Street/Sidewalk"
In [29]:
is_noise
Out[29]:
Unique Key
26589651     True
26593698    False
26594139    False
26595721    False
26590930    False
26592370    False
26595682    False
26595195    False
26590540    False
26594392    False
26595176    False
26591982    False
26594169    False
26594391    False
26590917    False
26591458    False
26594086     True
26595117    False
26590389    False
26594210    False
26592932    False
26594152    False
26589678    False
26592304    False
26591892    False
26591573     True
26590509    False
26591379    False
26594085     True
26589201    False
            ...  
26428764    False
26426166    False
26438565    False
26428990     True
26432659    False
26426096     True
26437764    False
26436286    False
26428989    False
26430030     True
26429663    False
26437763    False
26432955    False
26437035    False
26433197    False
26426060     True
26430628    False
26431648    False
26437034    False
26426094    False
26429040     True
26434084    False
26426164    False
26439710    False
26435569    False
26426013    False
26428083    False
26428987     True
26426115    False
26428033    False
Name: Complaint Type, dtype: bool

This just returns a series of True/False values, but we can plug this into the square brackets to only get the rows that meet that condition:

In [30]:
noise_complaints = complaints[is_noise]
In [31]:
noise_complaints
Out[31]:
Created Date Closed Date Agency Agency Name Complaint Type Descriptor Location Type Incident Zip Incident Address Street Name ... Bridge Highway Name Bridge Highway Direction Road Ramp Bridge Highway Segment Garage Lot Name Ferry Direction Ferry Terminal Name Latitude Longitude Location
Unique Key
26589651 10/31/2013 02:08:41 AM NaN NYPD New York City Police Department Noise - Street/Sidewalk Loud Talking Street/Sidewalk 11432 90-03 169 STREET 169 STREET ... NaN NaN NaN NaN NaN NaN NaN 40.708275 -73.791604 (40.70827532593202, -73.79160395779721)
26594086 10/31/2013 12:54:03 AM 10/31/2013 02:16:39 AM NYPD New York City Police Department Noise - Street/Sidewalk Loud Music/Party Street/Sidewalk 10310 173 CAMPBELL AVENUE CAMPBELL AVENUE ... NaN NaN NaN NaN NaN NaN NaN 40.636182 -74.116150 (40.63618202176914, -74.1161500428337)
26591573 10/31/2013 12:35:18 AM 10/31/2013 02:41:35 AM NYPD New York City Police Department Noise - Street/Sidewalk Loud Talking Street/Sidewalk 10312 24 PRINCETON LANE PRINCETON LANE ... NaN NaN NaN NaN NaN NaN NaN 40.553421 -74.196743 (40.55342078716953, -74.19674315017886)
26594085 10/31/2013 12:32:08 AM NaN NYPD New York City Police Department Noise - Street/Sidewalk Loud Talking Street/Sidewalk 10026 121 WEST 116 STREET WEST 116 STREET ... NaN NaN NaN NaN NaN NaN NaN 40.802390 -73.950526 (40.80238950799943, -73.95052644123253)
26595564 10/31/2013 12:30:36 AM NaN NYPD New York City Police Department Noise - Street/Sidewalk Loud Music/Party Street/Sidewalk 11236 AVENUE J AVENUE J ... NaN NaN NaN NaN NaN NaN NaN 40.634104 -73.911055 (40.634103775951736, -73.91105541883589)
26590446 10/31/2013 12:11:58 AM 10/31/2013 01:54:38 AM NYPD New York City Police Department Noise - Street/Sidewalk Loud Music/Party Street/Sidewalk 10459 819 EAST 167 STREET EAST 167 STREET ... NaN NaN NaN NaN NaN NaN NaN 40.826623 -73.899965 (40.826622810177874, -73.8999653556452)
26595553 10/31/2013 12:05:10 AM 10/31/2013 02:43:43 AM NYPD New York City Police Department Noise - Street/Sidewalk Loud Talking Street/Sidewalk 11225 25 LEFFERTS AVENUE LEFFERTS AVENUE ... NaN NaN NaN NaN NaN NaN NaN 40.661793 -73.959934 (40.6617931276793, -73.95993363978067)
26590695 10/30/2013 11:40:52 PM NaN NYPD New York City Police Department Noise - Street/Sidewalk Loud Talking Street/Sidewalk 10027 449 WEST 125 STREET WEST 125 STREET ... NaN NaN NaN NaN NaN NaN NaN 40.812717 -73.955678 (40.81271725522785, -73.95567750031833)
26594653 10/30/2013 11:26:32 PM 10/31/2013 12:18:54 AM NYPD New York City Police Department Noise - Street/Sidewalk Loud Music/Party Street/Sidewalk 11222 NaN NaN ... NaN NaN NaN NaN NaN NaN NaN 40.724600 -73.954271 (40.724599563793525, -73.95427134534344)
26594760 10/30/2013 11:17:43 PM 10/31/2013 01:05:43 AM NYPD New York City Police Department Noise - Street/Sidewalk Loud Music/Party Street/Sidewalk 10005 67 WALL STREET WALL STREET ... NaN NaN NaN NaN NaN NaN NaN 40.705749 -74.008342 (40.70574909022401, -74.00834244292587)
26593220 10/30/2013 11:14:15 PM 10/31/2013 01:28:20 AM NYPD New York City Police Department Noise - Street/Sidewalk Loud Music/Party Street/Sidewalk 10012 2 SPRING STREET SPRING STREET ... NaN NaN NaN NaN NaN NaN NaN 40.721005 -73.993990 (40.721004620274186, -73.99398975732854)
26590217 10/30/2013 11:11:28 PM 10/31/2013 01:29:28 AM NYPD New York City Police Department Noise - Street/Sidewalk Loud Talking Street/Sidewalk 10003 24 EAST 11 STREET EAST 11 STREET ... NaN NaN NaN NaN NaN NaN NaN 40.733587 -73.994014 (40.73358658490167, -73.99401388034954)
26593483 10/30/2013 10:56:50 PM NaN NYPD New York City Police Department Noise - Street/Sidewalk Loud Talking Street/Sidewalk 11102 4-10 27 AVENUE 27 AVENUE ... NaN NaN NaN NaN NaN NaN NaN 40.774670 -73.933296 (40.77467000453749, -73.93329608172101)
26593680 10/30/2013 10:43:14 PM 10/30/2013 11:35:07 PM NYPD New York City Police Department Noise - Street/Sidewalk Loud Talking Street/Sidewalk 11414 160-37 78 STREET 78 STREET ... NaN NaN NaN NaN NaN NaN NaN 40.654524 -73.853361 (40.654523538423916, -73.85336063975151)
26593530 10/30/2013 10:42:13 PM 10/30/2013 11:58:00 PM NYPD New York City Police Department Noise - Street/Sidewalk Loud Talking Street/Sidewalk 10040 570 WEST 193 STREET WEST 193 STREET ... NaN NaN NaN NaN NaN NaN NaN 40.856599 -73.927781 (40.85659933835734, -73.92778065178295)
26591036 10/30/2013 10:33:59 PM 10/31/2013 02:35:20 AM NYPD New York City Police Department Noise - Street/Sidewalk Loud Talking Street/Sidewalk 10472 1207 BEACH AVENUE BEACH AVENUE ... NaN NaN NaN NaN NaN NaN NaN 40.830362 -73.866022 (40.83036235589997, -73.86602154214397)
26594799 10/30/2013 10:31:49 PM 10/31/2013 02:58:14 AM NYPD New York City Police Department Noise - Street/Sidewalk Loud Music/Party Street/Sidewalk 10003 17 IRVING PLACE IRVING PLACE ... NaN NaN NaN NaN NaN NaN NaN 40.734915 -73.988450 (40.734914622073454, -73.98844969804303)
26591076 10/30/2013 10:31:11 PM 10/31/2013 12:40:39 AM NYPD New York City Police Department Noise - Street/Sidewalk Loud Talking Street/Sidewalk 10452 1565 TOWNSEND AVENUE TOWNSEND AVENUE ... NaN NaN NaN NaN NaN NaN NaN 40.843426 -73.914263 (40.84342647770325, -73.91426327900784)
26589308 10/30/2013 10:24:07 PM 10/30/2013 11:31:52 PM NYPD New York City Police Department Noise - Street/Sidewalk Loud Talking Street/Sidewalk 10031 529 WEST 151 STREET WEST 151 STREET ... NaN NaN NaN NaN NaN NaN NaN 40.829434 -73.945791 (40.82943385321531, -73.94579075649952)
26592708 10/30/2013 10:18:13 PM NaN NYPD New York City Police Department Noise - Street/Sidewalk Loud Talking Street/Sidewalk 11365 67-22 PARSONS BOULEVARD PARSONS BOULEVARD ... NaN NaN NaN NaN NaN NaN NaN 40.733500 -73.810710 (40.73349978477908, -73.81071026404766)
26591992 10/30/2013 10:02:58 PM 10/30/2013 10:23:20 PM NYPD New York City Police Department Noise - Street/Sidewalk Loud Talking Street/Sidewalk 11218 DITMAS AVENUE DITMAS AVENUE ... NaN NaN NaN NaN NaN NaN NaN 40.636169 -73.972455 (40.63616876563881, -73.97245504682485)
26589309 10/30/2013 09:48:55 PM 10/30/2013 10:34:58 PM NYPD New York City Police Department Noise - Street/Sidewalk Loud Talking Street/Sidewalk 10452 TOWNSEND AVENUE TOWNSEND AVENUE ... NaN NaN NaN NaN NaN NaN NaN 40.843986 -73.913851 (40.84398609069672, -73.9138505321433)
26595576 10/30/2013 08:48:47 PM 10/30/2013 08:54:39 PM NYPD New York City Police Department Noise - Street/Sidewalk Loud Talking Street/Sidewalk 10452 1565 TOWNSEND AVENUE TOWNSEND AVENUE ... NaN NaN NaN NaN NaN NaN NaN 40.843567 -73.914386 (40.843566548976696, -73.91438598253538)
26594339 10/30/2013 08:40:17 PM 10/30/2013 09:07:04 PM NYPD New York City Police Department Noise - Street/Sidewalk Loud Music/Party Street/Sidewalk 10030 222 WEST 134 STREET WEST 134 STREET ... NaN NaN NaN NaN NaN NaN NaN 40.815016 -73.945123 (40.81501553603748, -73.94512331602647)
26594167 10/30/2013 08:38:25 PM 10/30/2013 10:26:28 PM NYPD New York City Police Department Noise - Street/Sidewalk Loud Music/Party Street/Sidewalk 11218 126 BEVERLY ROAD BEVERLY ROAD ... NaN NaN NaN NaN NaN NaN NaN 40.642922 -73.978762 (40.6429222774404, -73.97876175474585)
26590110 10/30/2013 08:32:13 PM 10/30/2013 11:58:00 PM NYPD New York City Police Department Noise - Street/Sidewalk Loud Talking Street/Sidewalk 11210 2640 NOSTRAND AVENUE NOSTRAND AVENUE ... NaN NaN NaN NaN NaN NaN NaN 40.620390 -73.945676 (40.620390414106915, -73.94567596310624)
26590005 10/30/2013 08:02:48 PM 10/30/2013 08:28:48 PM NYPD New York City Police Department Noise - Street/Sidewalk Loud Talking Street/Sidewalk 10452 1565 TOWNSEND AVENUE TOWNSEND AVENUE ... NaN NaN NaN NaN NaN NaN NaN 40.843426 -73.914263 (40.84342647770325, -73.91426327900784)
26594205 10/30/2013 07:44:49 PM 10/30/2013 10:28:33 PM NYPD New York City Police Department Noise - Street/Sidewalk Loud Talking Street/Sidewalk 10452 1565 TOWNSEND AVENUE TOWNSEND AVENUE ... NaN NaN NaN NaN NaN NaN NaN 40.843567 -73.914386 (40.843566548976696, -73.91438598253538)
26590556 10/30/2013 06:59:49 PM 10/30/2013 09:30:52 PM NYPD New York City Police Department Noise - Street/Sidewalk Loud Talking Street/Sidewalk 10022 NaN NaN ... NaN NaN NaN NaN NaN NaN NaN 40.762179 -73.966129 (40.762179295833846, -73.96612906924031)
26593218 10/30/2013 06:24:42 PM 10/30/2013 07:31:54 PM NYPD New York City Police Department Noise - Street/Sidewalk Loud Talking Street/Sidewalk 10452 1565 TOWNSEND AVENUE TOWNSEND AVENUE ... NaN NaN NaN NaN NaN NaN NaN 40.843567 -73.914386 (40.843566548976696, -73.91438598253538)
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
26434521 10/04/2013 03:46:11 AM 10/04/2013 05:26:12 AM NYPD New York City Police Department Noise - Street/Sidewalk Loud Music/Party Street/Sidewalk 10012 3 BLEEKER STREET BLEEKER STREET ... NaN NaN NaN NaN NaN NaN NaN 40.725311 -73.992478 (40.72531105665101, -73.99247769162362)
26433417 10/04/2013 03:25:00 AM 10/04/2013 07:13:56 AM NYPD New York City Police Department Noise - Street/Sidewalk Loud Talking Street/Sidewalk 10033 622 WEST 182 STREET WEST 182 STREET ... NaN NaN NaN NaN NaN NaN NaN 40.850150 -73.933932 (40.850150227792625, -73.9339324637959)
26432543 10/04/2013 02:31:55 AM 10/04/2013 06:38:00 AM NYPD New York City Police Department Noise - Street/Sidewalk Loud Talking Street/Sidewalk 11238 400 LINCOLN PLACE LINCOLN PLACE ... NaN NaN NaN NaN NaN NaN NaN 40.672872 -73.963294 (40.67287217154439, -73.96329404057829)
26427965 10/04/2013 02:29:50 AM 10/04/2013 06:37:42 AM NYPD New York City Police Department Noise - Street/Sidewalk Loud Music/Party Street/Sidewalk 11238 840 WASHINGTON AVE WASHINGTON AVE ... NaN NaN NaN NaN NaN NaN NaN 40.672935 -73.963128 (40.672935248350534, -73.96312817637728)
26428058 10/04/2013 02:27:21 AM 10/04/2013 06:37:04 AM NYPD New York City Police Department Noise - Street/Sidewalk Loud Talking Street/Sidewalk 11238 407 LINCOLN PLACE LINCOLN PLACE ... NaN NaN NaN NaN NaN NaN NaN 40.672831 -73.963024 (40.67283091333569, -73.96302368954532)
26426116 10/04/2013 02:27:09 AM 10/04/2013 04:03:08 AM NYPD New York City Police Department Noise - Street/Sidewalk Loud Music/Party Street/Sidewalk 10458 2311 CROTONA AVENUE CROTONA AVENUE ... NaN NaN NaN NaN NaN NaN NaN 40.852362 -73.884423 (40.85236190814451, -73.88442312851471)
26430616 10/04/2013 02:23:35 AM 10/04/2013 04:24:51 AM NYPD New York City Police Department Noise - Street/Sidewalk Loud Talking Street/Sidewalk 10026 121 WEST 116 STREET WEST 116 STREET ... NaN NaN NaN NaN NaN NaN NaN 40.802390 -73.950526 (40.80238950799943, -73.95052644123253)
26426026 10/04/2013 02:15:12 AM 10/04/2013 06:36:54 AM NYPD New York City Police Department Noise - Street/Sidewalk Loud Talking Street/Sidewalk 11238 NaN NaN ... NaN NaN NaN NaN NaN NaN NaN 40.672600 -73.962753 (40.672600264989946, -73.96275344461327)
26428931 10/04/2013 02:10:04 AM 10/04/2013 03:02:08 AM NYPD New York City Police Department Noise - Street/Sidewalk Loud Talking Street/Sidewalk 10016 EAST 28 STREET EAST 28 STREET ... NaN NaN NaN NaN NaN NaN NaN 40.742308 -73.981660 (40.74230810331397, -73.98166037217192)
26427966 10/04/2013 01:47:46 AM 10/04/2013 04:42:19 AM NYPD New York City Police Department Noise - Street/Sidewalk Loud Talking Street/Sidewalk 11230 630 RUGBY ROAD RUGBY ROAD ... NaN NaN NaN NaN NaN NaN NaN 40.634234 -73.964065 (40.63423411912724, -73.96406469568188)
26428936 10/04/2013 01:35:41 AM 10/04/2013 06:48:33 AM NYPD New York City Police Department Noise - Street/Sidewalk Loud Talking Street/Sidewalk 10027 NaN NaN ... NaN NaN NaN NaN NaN NaN NaN 40.815527 -73.959718 (40.81552658739026, -73.95971825500413)
26428984 10/04/2013 01:30:27 AM 10/04/2013 02:49:38 AM NYPD New York City Police Department Noise - Street/Sidewalk Loud Music/Party Street/Sidewalk 11412 NaN NaN ... NaN NaN NaN NaN NaN NaN NaN 40.707013 -73.765790 (40.70701250870174, -73.76579012401083)
26429664 10/04/2013 01:18:37 AM 10/04/2013 03:24:57 AM NYPD New York City Police Department Noise - Street/Sidewalk Loud Music/Party Street/Sidewalk 10459 973 UNION AVENUE UNION AVENUE ... NaN NaN NaN NaN NaN NaN NaN 40.823432 -73.901585 (40.82343209751158, -73.90158522761199)
26431596 10/04/2013 01:16:01 AM 10/04/2013 02:02:20 AM NYPD New York City Police Department Noise - Street/Sidewalk Loud Talking Street/Sidewalk 11692 55-05 BEACH CHANNEL DRIVE BEACH CHANNEL DRIVE ... NaN NaN NaN NaN NaN NaN NaN 40.594979 -73.785833 (40.594979009156205, -73.7858326072286)
26425971 10/04/2013 01:13:24 AM 10/04/2013 06:36:46 AM NYPD New York City Police Department Noise - Street/Sidewalk Loud Talking Street/Sidewalk 10467 375 EAST MOSHOLU PARKWAY NORTH EAST MOSHOLU PARKWAY NORTH ... NaN NaN NaN NaN NaN NaN NaN 40.870662 -73.880049 (40.870661854738415, -73.8800488518787)
26431562 10/04/2013 01:02:45 AM 10/04/2013 04:43:24 AM NYPD New York City Police Department Noise - Street/Sidewalk Loud Music/Party Street/Sidewalk 11230 630 RUGBY ROAD RUGBY ROAD ... NaN NaN NaN NaN NaN NaN NaN 40.634234 -73.964065 (40.63423411912724, -73.96406469568188)
26431994 10/04/2013 12:59:57 AM 10/04/2013 04:43:24 AM NYPD New York City Police Department Noise - Street/Sidewalk Loud Talking Street/Sidewalk 11230 630 RUGBY ROAD RUGBY ROAD ... NaN NaN NaN NaN NaN NaN NaN 40.634234 -73.964065 (40.63423411912724, -73.96406469568188)
26431930 10/04/2013 12:55:46 AM 10/04/2013 07:12:52 AM NYPD New York City Police Department Noise - Street/Sidewalk Loud Talking Street/Sidewalk 10040 108 ELLWOOD STREET ELLWOOD STREET ... NaN NaN NaN NaN NaN NaN NaN 40.861829 -73.929456 (40.8618290295582, -73.9294560541168)
26431874 10/04/2013 12:55:42 AM 10/04/2013 02:44:20 AM NYPD New York City Police Department Noise - Street/Sidewalk Loud Talking Street/Sidewalk 10027 213 W 131ST ST W 131ST ST ... NaN NaN NaN NaN NaN NaN NaN 40.813284 -73.946252 (40.81328414856855, -73.94625187216151)
26430877 10/04/2013 12:52:45 AM 10/04/2013 07:07:42 AM NYPD New York City Police Department Noise - Street/Sidewalk Loud Talking Street/Sidewalk 10456 1314 GRAND CONCOURSE GRAND CONCOURSE ... NaN NaN NaN NaN NaN NaN NaN 40.836664 -73.915584 (40.836664485984976, -73.91558382507789)
26432669 10/04/2013 12:42:45 AM 10/04/2013 06:48:20 AM NYPD New York City Police Department Noise - Street/Sidewalk Loud Talking Street/Sidewalk 11249 125 TAYLOR STREET TAYLOR STREET ... NaN NaN NaN NaN NaN NaN NaN 40.705733 -73.963381 (40.70573258854396, -73.96338054284558)
26430709 10/04/2013 12:23:47 AM 10/04/2013 02:45:23 AM NYPD New York City Police Department Noise - Street/Sidewalk Loud Talking Street/Sidewalk 10030 145 WEST 138 STREET WEST 138 STREET ... NaN NaN NaN NaN NaN NaN NaN 40.816597 -73.940783 (40.81659709236496, -73.94078307354462)
26427751 10/04/2013 12:22:50 AM 10/04/2013 01:03:44 AM NYPD New York City Police Department Noise - Street/Sidewalk Loud Talking Street/Sidewalk 10032 NaN NaN ... NaN NaN NaN NaN NaN NaN NaN 40.833488 -73.941578 (40.83348848222364, -73.94157750579586)
26431005 10/04/2013 12:21:04 AM 10/04/2013 01:50:26 AM NYPD New York City Police Department Noise - Street/Sidewalk Loud Music/Party Street/Sidewalk 11205 382 WILLOUGHBY AVENUE WILLOUGHBY AVENUE ... NaN NaN NaN NaN NaN NaN NaN 40.692956 -73.955350 (40.692955608285715, -73.95534962720475)
26428990 10/04/2013 12:15:52 AM 10/04/2013 12:44:52 AM NYPD New York City Police Department Noise - Street/Sidewalk Loud Talking Street/Sidewalk 10014 733 WASHINGTON STREET WASHINGTON STREET ... NaN NaN NaN NaN NaN NaN NaN 40.736304 -74.008299 (40.736303747410034, -74.00829935904578)
26426096 10/04/2013 12:14:09 AM 10/04/2013 01:03:46 AM NYPD New York City Police Department Noise - Street/Sidewalk Loud Talking Street/Sidewalk 10032 539 WEST 162 STREET WEST 162 STREET ... NaN NaN NaN NaN NaN NaN NaN 40.836532 -73.941018 (40.836532082987944, -73.9410182995914)
26430030 10/04/2013 12:12:07 AM 10/04/2013 02:45:24 AM NYPD New York City Police Department Noise - Street/Sidewalk Loud Talking Street/Sidewalk 10027 215 WEST 131ST STREET WEST 131ST STREET ... NaN NaN NaN NaN NaN NaN NaN 40.813339 -73.946328 (40.81333907832113, -73.94632769228208)
26426060 10/04/2013 12:06:39 AM 10/04/2013 12:31:16 AM NYPD New York City Police Department Noise - Street/Sidewalk Loud Talking Street/Sidewalk 11224 4823 BEACH 48 STREET BEACH 48 STREET ... NaN NaN NaN NaN NaN NaN NaN 40.577515 -74.012207 (40.57751513866536, -74.01220705163807)
26429040 10/04/2013 12:04:52 AM 10/04/2013 03:01:04 AM NYPD New York City Police Department Noise - Street/Sidewalk Loud Talking Street/Sidewalk 10003 99 2 AVENUE 2 AVENUE ... NaN NaN NaN NaN NaN NaN NaN 40.727251 -73.988660 (40.727251287038676, -73.98866028603422)
26428987 10/04/2013 12:00:45 AM 10/04/2013 01:25:01 AM NYPD New York City Police Department Noise - Street/Sidewalk Loud Talking Street/Sidewalk 10016 344 EAST 28 STREET EAST 28 STREET ... NaN NaN NaN NaN NaN NaN NaN 40.740295 -73.976952 (40.740295354643706, -73.97695165980414)

1928 rows × 51 columns

In [32]:
noise_complaints.shape
Out[32]:
(1928, 51)

We can combine multiple conditions with & and |

In [33]:
is_brooklyn = complaints['Borough'] == "BROOKLYN"
In [34]:
noise_complaints = complaints[is_noise & is_brooklyn]
In [35]:
noise_complaints.shape
Out[35]:
(456, 51)

Now, it turns out that, in this dataset, the "Complaint Type" column can contain many different types of noise complaints:

In [36]:
complaints['Complaint Type'].unique()
Out[36]:
array(['Noise - Street/Sidewalk', 'Illegal Parking', 'Noise - Commercial',
       'Noise - Vehicle', 'Rodent', 'Blocked Driveway',
       'Noise - House of Worship', 'Street Light Condition',
       'Harboring Bees/Wasps', 'Taxi Complaint', 'Homeless Encampment',
       'Traffic Signal Condition', 'Food Establishment', 'Noise - Park',
       'Broken Muni Meter', 'Benefit Card Replacement',
       'Sanitation Condition', 'ELECTRIC', 'PLUMBING', 'HEATING',
       'GENERAL CONSTRUCTION', 'Street Condition', 'Consumer Complaint',
       'Derelict Vehicles', 'Noise', 'Drinking', 'Indoor Air Quality',
       'Panhandling', 'Derelict Vehicle', 'Lead', 'Water System',
       'Noise - Helicopter', 'Homeless Person Assistance',
       'Root/Sewer/Sidewalk Condition', 'Sidewalk Condition', 'Graffiti',
       'DOF Literature Request', 'Animal in a Park',
       'Overgrown Tree/Branches', 'Air Quality', 'Dirty Conditions',
       'Water Quality', 'Other Enforcement', 'Collection Truck Noise',
       'Illegal Tree Damage', 'Litter Basket / Request',
       'Violation of Park Rules', 'Maintenance or Facility', 'Sewer',
       'Agency Issues', 'Beach/Pool/Sauna Complaint',
       'Bike/Roller/Skate Chronic', 'School Maintenance', 'Animal Abuse',
       'Water Conservation', 'Damaged Tree', 'Street Sign - Damaged',
       'Taxi Compliment', 'Overflowing Litter Baskets', 'Asbestos',
       'Indoor Sewage', 'Hazardous Materials', 'Dead Tree', 'Smoking',
       'Missed Collection (All Materials)', 'Standing Water',
       'Sweeping/Missed-Inadequate',
       'DCA / DOH New License Application Request', 'Ferry Inquiry',
       'For Hire Vehicle Complaint', 'Traffic', 'Plumbing', 'Building/Use',
       'SCRIE', 'Boilers', 'General Construction/Plumbing', 'Construction',
       'Special Projects Inspection Team (SPIT)', 'Elevator',
       'DOT Literature Request', 'Fire Safety Director - F58',
       'Highway Condition', 'Street Sign - Missing',
       'Investigations and Discipline (IAD)', 'Non-Residential Heat',
       'Electrical', 'DOF Parking - Tax Exemption', 'Special Enforcement',
       'Street Sign - Dangling', 'Internal Code',
       'Unsanitary Animal Pvt Property', 'Derelict Bicycle',
       'Industrial Waste', 'Vacant Lot', 'Food Poisoning',
       'Found Property', 'Bridge Condition', 'Bike Rack Condition',
       'BEST/Site Safety', 'EAP Inspection - F59', 'Vending',
       'Public Assembly', 'Public Payphone Complaint', 'Unleashed Dog',
       'Curb Condition', 'Cranes and Derricks', 'DCA Literature Request',
       'City Vehicle Placard Complaint',
       'Legal Services Provider Complaint', 'Posting Advertisement',
       'PAINT - PLASTER', 'NONCONST', 'APPLIANCE', 'CONSTRUCTION',
       'Recycling Enforcement', 'Special Natural Area District (SNAD)',
       'DPR Internal', 'Opinion for the Mayor', 'Mold', 'Snow',
       'Complaint', 'Urinating in Public', 'Senior Center Complaint',
       'Animal Facility - No Permit', 'Unsanitary Animal Facility',
       'Parking Card', 'Fire Alarm - Reinspection',
       'Request for Information', 'Miscellaneous Categories',
       'Unsanitary Pigeon Condition', 'Plant', 'Fire Alarm - Replacement',
       'DPR Literature Request', 'Invitation', 'Scaffold Safety',
       'Transportation Provider Complaint', 'Ferry Complaint',
       'STRUCTURAL', 'Bus Stop Shelter Placement', 'Compliment',
       'Misc. Comments', 'Disorderly Youth', 'OEM Literature Request',
       'Fire Alarm - New System', 'Fire Alarm - Modification',
       'Stalled Sites', 'Public Toilet', 'Tattooing',
       'Illegal Animal Sold', 'Illegal Animal Kept as Pet',
       'DFTA Literature Request', 'Adopt-A-Basket',
       'DHS Income Savings Requirement', 'Poison Ivy', 'Tunnel Condition',
       'Drinking Water', 'Municipal Parking Facility', 'Trans Fat',
       'Illegal Fireworks', 'Window Guard', 'Open Flame Permit', 'DWD',
       'Highway Sign - Damaged', 'Ferry Permit', 'X-Ray Machine/Equipment'], dtype=object)

If a column contains text data, we can manipulate it (and filter by it) by accessing the str attribute of the Series:

In [37]:
ctypes = complaints['Complaint Type']
In [38]:
ctypes[ctypes.str.contains("Noise")].unique()
Out[38]:
array(['Noise - Street/Sidewalk', 'Noise - Commercial', 'Noise - Vehicle',
       'Noise - House of Worship', 'Noise - Park', 'Noise',
       'Noise - Helicopter', 'Collection Truck Noise'], dtype=object)
In [39]:
is_all_noise = complaints['Complaint Type'].str.contains("Noise")
in_brooklyn =  complaints['Borough'] == "BROOKLYN"
In [40]:
brooklyn_noise = complaints[is_all_noise & in_brooklyn]
In [41]:
brooklyn_noise.shape
Out[41]:
(2344, 51)
In [43]:
brooklyn_noise['Complaint Type'].value_counts()
Out[43]:
Noise - Commercial          775
Noise                       767
Noise - Street/Sidewalk     456
Noise - Vehicle             237
Noise - Park                 60
Noise - House of Worship     23
Noise - Helicopter           23
Collection Truck Noise        3
Name: Complaint Type, dtype: int64
In [87]:
import matplotlib.pyplot as plt

for ctype, count in ctcounts[:10].iteritems():
    df = complaints[complaints['Complaint Type'] == ctype]
    df.plot(kind='scatter', x='Longitude', y='Latitude', subplots=True, label=ctype, marker=".")

Task 3: What is the top complaint in each borough?

To do this task, we need to split the data into groups. First off, we need to separate the data by boroughs. We can do this with the groupby method. This returns a special DataFrameGroupBy object which we can do some operations on. For example, we can call its size method to obtain a Series where the row label is the column we're grouping by, and the values is the number of rows in that group.

In [130]:
complaints.groupby('Borough').size()
Out[130]:
Borough
BRONX            19686
BROOKLYN         32890
MANHATTAN        24288
QUEENS           22281
STATEN ISLAND     4817
Unspecified       7107
dtype: int64

We can also group by multiple columns:

In [69]:
gb = complaints.groupby(['Borough', 'Complaint Type'])
In [70]:
gb.size()
Out[70]:
Borough      Complaint Type                   
BRONX        APPLIANCE                            108
             Air Quality                           40
             Animal Abuse                         162
             Animal in a Park                      17
             Asbestos                              17
             BEST/Site Safety                      12
             Bike/Roller/Skate Chronic              2
             Blocked Driveway                     656
             Boilers                               17
             Bridge Condition                       4
             Broken Muni Meter                    121
             Building/Use                         102
             Bus Stop Shelter Placement             1
             CONSTRUCTION                          18
             Construction                           2
             Consumer Complaint                   155
             Curb Condition                        11
             DPR Internal                          10
             Damaged Tree                         105
             Dead Tree                             55
             Derelict Vehicle                      80
             Derelict Vehicles                     83
             Dirty Conditions                     475
             Disorderly Youth                       5
             Drinking                              17
             Drinking Water                         2
             EAP Inspection - F59                   1
             ELECTRIC                             619
             Electrical                             6
             Elevator                              75
                                                 ... 
Unspecified  Ferry Complaint                        4
             Ferry Inquiry                         32
             Ferry Permit                           1
             For Hire Vehicle Complaint             3
             Found Property                         1
             Highway Condition                      3
             Invitation                             3
             Legal Services Provider Complaint      2
             Maintenance or Facility                4
             Misc. Comments                         2
             Noise                                  5
             Noise - Street/Sidewalk                1
             Noise - Vehicle                        1
             OEM Literature Request                29
             Opinion for the Mayor                  2
             Parking Card                           8
             Request for Information               18
             SCRIE                                  9
             Sanitation Condition                   3
             School Maintenance                     8
             Senior Center Complaint                2
             Sewer                                  4
             Street Condition                       6
             Street Light Condition               112
             Taxi Complaint                        28
             Taxi Compliment                       73
             Traffic Signal Condition              10
             Transportation Provider Complaint      1
             Vacant Lot                             3
             Water System                           2
dtype: int64

This may look like a DataFrame (because it has multiple columns), but it's actually a Series with a hierarchical index. To work on it like a DataFrame, we need to convert it into one:

In [73]:
gb.size().to_frame()
Out[73]:
0
Borough Complaint Type
BRONX APPLIANCE 108
Air Quality 40
Animal Abuse 162
Animal in a Park 17
Asbestos 17
BEST/Site Safety 12
Bike/Roller/Skate Chronic 2
Blocked Driveway 656
Boilers 17
Bridge Condition 4
Broken Muni Meter 121
Building/Use 102
Bus Stop Shelter Placement 1
CONSTRUCTION 18
Construction 2
Consumer Complaint 155
Curb Condition 11
DPR Internal 10
Damaged Tree 105
Dead Tree 55
Derelict Vehicle 80
Derelict Vehicles 83
Dirty Conditions 475
Disorderly Youth 5
Drinking 17
Drinking Water 2
EAP Inspection - F59 1
ELECTRIC 619
Electrical 6
Elevator 75
... ... ...
Unspecified Ferry Complaint 4
Ferry Inquiry 32
Ferry Permit 1
For Hire Vehicle Complaint 3
Found Property 1
Highway Condition 3
Invitation 3
Legal Services Provider Complaint 2
Maintenance or Facility 4
Misc. Comments 2
Noise 5
Noise - Street/Sidewalk 1
Noise - Vehicle 1
OEM Literature Request 29
Opinion for the Mayor 2
Parking Card 8
Request for Information 18
SCRIE 9
Sanitation Condition 3
School Maintenance 8
Senior Center Complaint 2
Sewer 4
Street Condition 6
Street Light Condition 112
Taxi Complaint 28
Taxi Compliment 73
Traffic Signal Condition 10
Transportation Provider Complaint 1
Vacant Lot 3
Water System 2

638 rows × 1 columns

We can also 'flatten' the hierarchical index using reset_index:

In [75]:
counts = gb.size().to_frame().reset_index()
In [82]:
counts[:5]
Out[82]:
Borough Complaint Type Count
0 BRONX APPLIANCE 108
1 BRONX Air Quality 40
2 BRONX Animal Abuse 162
3 BRONX Animal in a Park 17
4 BRONX Asbestos 17

We have a column that has been named "0" by default. We can do one of two things. We can either explicitly rename the columns:

In [77]:
counts = counts.rename(columns={0 : "Count"})
In [81]:
counts[:5]
Out[81]:
Borough Complaint Type Count
0 BRONX APPLIANCE 108
1 BRONX Air Quality 40
2 BRONX Animal Abuse 162
3 BRONX Animal in a Park 17
4 BRONX Asbestos 17

Or we can just specify the column name when calling to_frame:

In [92]:
counts = gb.size().to_frame("Count").reset_index()
In [93]:
counts[:5]
Out[93]:
Borough Complaint Type Count
0 BRONX APPLIANCE 108
1 BRONX Air Quality 40
2 BRONX Animal Abuse 162
3 BRONX Animal in a Park 17
4 BRONX Asbestos 17

To compute the top complaint in each borough, we could just take the information we have in the above dataframe, and use regular Python.

First of all, we can iterate over the dataframe like this:

In [94]:
for i, row in counts.iterrows():
    print(i, row["Borough"], row["Complaint Type"], row["Count"])
0 BRONX APPLIANCE 108
1 BRONX Air Quality 40
2 BRONX Animal Abuse 162
3 BRONX Animal in a Park 17
4 BRONX Asbestos 17
5 BRONX BEST/Site Safety 12
6 BRONX Bike/Roller/Skate Chronic 2
7 BRONX Blocked Driveway 656
8 BRONX Boilers 17
9 BRONX Bridge Condition 4
10 BRONX Broken Muni Meter 121
11 BRONX Building/Use 102
12 BRONX Bus Stop Shelter Placement 1
13 BRONX CONSTRUCTION 18
14 BRONX Construction 2
15 BRONX Consumer Complaint 155
16 BRONX Curb Condition 11
17 BRONX DPR Internal 10
18 BRONX Damaged Tree 105
19 BRONX Dead Tree 55
20 BRONX Derelict Vehicle 80
21 BRONX Derelict Vehicles 83
22 BRONX Dirty Conditions 475
23 BRONX Disorderly Youth 5
24 BRONX Drinking 17
25 BRONX Drinking Water 2
26 BRONX EAP Inspection - F59 1
27 BRONX ELECTRIC 619
28 BRONX Electrical 6
29 BRONX Elevator 75
30 BRONX Fire Alarm - New System 1
31 BRONX Fire Alarm - Reinspection 1
32 BRONX Fire Safety Director - F58 2
33 BRONX Food Establishment 44
34 BRONX Food Poisoning 21
35 BRONX For Hire Vehicle Complaint 28
36 BRONX GENERAL CONSTRUCTION 2273
37 BRONX General Construction/Plumbing 93
38 BRONX Graffiti 170
39 BRONX HEATING 4093
40 BRONX Harboring Bees/Wasps 2
41 BRONX Hazardous Materials 20
42 BRONX Highway Condition 15
43 BRONX Highway Sign - Damaged 1
44 BRONX Homeless Encampment 17
45 BRONX Homeless Person Assistance 6
46 BRONX Illegal Animal Kept as Pet 2
47 BRONX Illegal Fireworks 2
48 BRONX Illegal Parking 391
49 BRONX Illegal Tree Damage 13
50 BRONX Indoor Air Quality 23
51 BRONX Indoor Sewage 14
52 BRONX Industrial Waste 37
53 BRONX Internal Code 1
54 BRONX Investigations and Discipline (IAD) 5
55 BRONX Lead 8
56 BRONX Litter Basket / Request 11
57 BRONX Maintenance or Facility 42
58 BRONX Miscellaneous Categories 1
59 BRONX Missed Collection (All Materials) 61
60 BRONX Mold 3
61 BRONX NONCONST 1134
62 BRONX Noise 168
63 BRONX Noise - Commercial 136
64 BRONX Noise - Helicopter 1
65 BRONX Noise - House of Worship 11
66 BRONX Noise - Park 9
67 BRONX Noise - Street/Sidewalk 292
68 BRONX Noise - Vehicle 102
69 BRONX Non-Residential Heat 28
70 BRONX Other Enforcement 19
71 BRONX Overflowing Litter Baskets 8
72 BRONX Overgrown Tree/Branches 123
73 BRONX PAINT - PLASTER 1653
74 BRONX PLUMBING 1591
75 BRONX Plant 1
76 BRONX Plumbing 14
77 BRONX Poison Ivy 1
78 BRONX Posting Advertisement 1
79 BRONX Public Assembly 1
80 BRONX Public Payphone Complaint 3
81 BRONX Recycling Enforcement 12
82 BRONX Rodent 456
83 BRONX Root/Sewer/Sidewalk Condition 56
84 BRONX SCRIE 78
85 BRONX STRUCTURAL 4
86 BRONX Sanitation Condition 227
87 BRONX Scaffold Safety 2
88 BRONX School Maintenance 17
89 BRONX Senior Center Complaint 3
90 BRONX Sewer 201
91 BRONX Sidewalk Condition 22
92 BRONX Smoking 12
93 BRONX Special Enforcement 18
94 BRONX Special Natural Area District (SNAD) 1
95 BRONX Special Projects Inspection Team (SPIT) 10
96 BRONX Standing Water 4
97 BRONX Street Condition 379
98 BRONX Street Light Condition 1642
99 BRONX Street Sign - Damaged 17
100 BRONX Street Sign - Dangling 12
101 BRONX Street Sign - Missing 51
102 BRONX Sweeping/Missed-Inadequate 12
103 BRONX Tattooing 1
104 BRONX Taxi Complaint 19
105 BRONX Traffic 21
106 BRONX Traffic Signal Condition 329
107 BRONX Transportation Provider Complaint 1
108 BRONX Unleashed Dog 7
109 BRONX Unsanitary Animal Pvt Property 21
110 BRONX Unsanitary Pigeon Condition 8
111 BRONX Urinating in Public 2
112 BRONX Vacant Lot 20
113 BRONX Vending 23
114 BRONX Violation of Park Rules 17
115 BRONX Water Conservation 15
116 BRONX Water Quality 13
117 BRONX Water System 278
118 BROOKLYN APPLIANCE 175
119 BROOKLYN Air Quality 125
120 BROOKLYN Animal Facility - No Permit 4
121 BROOKLYN Animal in a Park 15
122 BROOKLYN Asbestos 29
123 BROOKLYN BEST/Site Safety 16
124 BROOKLYN Beach/Pool/Sauna Complaint 6
125 BROOKLYN Bike Rack Condition 3
126 BROOKLYN Bike/Roller/Skate Chronic 3
127 BROOKLYN Blocked Driveway 1776
128 BROOKLYN Boilers 26
129 BROOKLYN Bridge Condition 9
130 BROOKLYN Broken Muni Meter 252
131 BROOKLYN Building/Use 256
132 BROOKLYN Bus Stop Shelter Placement 3
133 BROOKLYN CONSTRUCTION 28
134 BROOKLYN City Vehicle Placard Complaint 8
135 BROOKLYN Collection Truck Noise 3
136 BROOKLYN Construction 4
137 BROOKLYN Consumer Complaint 357
138 BROOKLYN Curb Condition 27
139 BROOKLYN DPR Internal 12
140 BROOKLYN Damaged Tree 300
141 BROOKLYN Dead Tree 156
142 BROOKLYN Derelict Bicycle 37
143 BROOKLYN Derelict Vehicle 226
144 BROOKLYN Derelict Vehicles 354
145 BROOKLYN Dirty Conditions 680
146 BROOKLYN Disorderly Youth 8
147 BROOKLYN Drinking 18
148 BROOKLYN Drinking Water 3
149 BROOKLYN EAP Inspection - F59 2
150 BROOKLYN ELECTRIC 987
151 BROOKLYN Electrical 27
152 BROOKLYN Elevator 50
153 BROOKLYN Fire Alarm - New System 1
154 BROOKLYN Fire Alarm - Reinspection 3
155 BROOKLYN Fire Safety Director - F58 10
156 BROOKLYN Food Establishment 117
157 BROOKLYN Food Poisoning 38
158 BROOKLYN For Hire Vehicle Complaint 42
159 BROOKLYN Found Property 2
160 BROOKLYN GENERAL CONSTRUCTION 2791
161 BROOKLYN General Construction/Plumbing 268
162 BROOKLYN Graffiti 468
163 BROOKLYN HEATING 4548
164 BROOKLYN Harboring Bees/Wasps 3
165 BROOKLYN Hazardous Materials 45
166 BROOKLYN Highway Condition 32
167 BROOKLYN Homeless Encampment 59
168 BROOKLYN Homeless Person Assistance 31
169 BROOKLYN Illegal Animal Kept as Pet 5
170 BROOKLYN Illegal Animal Sold 2
171 BROOKLYN Illegal Fireworks 1
172 BROOKLYN Illegal Parking 1147
173 BROOKLYN Illegal Tree Damage 44
174 BROOKLYN Indoor Air Quality 65
175 BROOKLYN Indoor Sewage 48
176 BROOKLYN Industrial Waste 47
177 BROOKLYN Investigations and Discipline (IAD) 12
178 BROOKLYN Lead 68
179 BROOKLYN Litter Basket / Request 92
180 BROOKLYN Maintenance or Facility 167
181 BROOKLYN Missed Collection (All Materials) 299
182 BROOKLYN Mold 9
183 BROOKLYN NONCONST 1500
184 BROOKLYN Noise 767
185 BROOKLYN Noise - Commercial 775
186 BROOKLYN Noise - Helicopter 23
187 BROOKLYN Noise - House of Worship 23
188 BROOKLYN Noise - Park 60
189 BROOKLYN Noise - Street/Sidewalk 456
190 BROOKLYN Noise - Vehicle 237
191 BROOKLYN Non-Residential Heat 24
192 BROOKLYN Other Enforcement 128
193 BROOKLYN Overflowing Litter Baskets 28
194 BROOKLYN Overgrown Tree/Branches 329
195 BROOKLYN PAINT - PLASTER 1821
196 BROOKLYN PLUMBING 2017
197 BROOKLYN Panhandling 1
198 BROOKLYN Plant 14
199 BROOKLYN Plumbing 21
200 BROOKLYN Poison Ivy 1
201 BROOKLYN Posting Advertisement 2
202 BROOKLYN Public Assembly 1
203 BROOKLYN Public Payphone Complaint 8
204 BROOKLYN Public Toilet 2
205 BROOKLYN Recycling Enforcement 15
206 BROOKLYN Rodent 467
207 BROOKLYN Root/Sewer/Sidewalk Condition 261
208 BROOKLYN SCRIE 149
209 BROOKLYN STRUCTURAL 4
210 BROOKLYN Sanitation Condition 716
211 BROOKLYN Scaffold Safety 1
212 BROOKLYN School Maintenance 26
213 BROOKLYN Senior Center Complaint 4
214 BROOKLYN Sewer 404
215 BROOKLYN Sidewalk Condition 108
216 BROOKLYN Smoking 26
217 BROOKLYN Special Enforcement 69
218 BROOKLYN Special Projects Inspection Team (SPIT) 25
219 BROOKLYN Standing Water 16
220 BROOKLYN Street Condition 1012
221 BROOKLYN Street Light Condition 1956
222 BROOKLYN Street Sign - Damaged 520
223 BROOKLYN Street Sign - Dangling 40
224 BROOKLYN Street Sign - Missing 153
225 BROOKLYN Sweeping/Missed-Inadequate 86
226 BROOKLYN Tattooing 1
227 BROOKLYN Taxi Complaint 159
228 BROOKLYN Traffic 46
229 BROOKLYN Traffic Signal Condition 1196
230 BROOKLYN Transportation Provider Complaint 1
231 BROOKLYN Unleashed Dog 6
232 BROOKLYN Unsanitary Animal Facility 4
233 BROOKLYN Unsanitary Animal Pvt Property 29
234 BROOKLYN Unsanitary Pigeon Condition 10
235 BROOKLYN Urinating in Public 12
236 BROOKLYN Vacant Lot 61
237 BROOKLYN Vending 35
238 BROOKLYN Violation of Park Rules 8
239 BROOKLYN Water Conservation 31
240 BROOKLYN Water Quality 28
241 BROOKLYN Water System 517
242 BROOKLYN Window Guard 1
243 MANHATTAN APPLIANCE 89
244 MANHATTAN Adopt-A-Basket 3
245 MANHATTAN Air Quality 203
246 MANHATTAN Animal Abuse 2
247 MANHATTAN Animal Facility - No Permit 2
248 MANHATTAN Animal in a Park 23
249 MANHATTAN Asbestos 60
250 MANHATTAN BEST/Site Safety 18
251 MANHATTAN Bike Rack Condition 4
252 MANHATTAN Bike/Roller/Skate Chronic 24
253 MANHATTAN Blocked Driveway 157
254 MANHATTAN Boilers 24
255 MANHATTAN Bridge Condition 3
256 MANHATTAN Broken Muni Meter 1312
257 MANHATTAN Building/Use 71
258 MANHATTAN Bus Stop Shelter Placement 5
259 MANHATTAN CONSTRUCTION 9
260 MANHATTAN City Vehicle Placard Complaint 11
261 MANHATTAN Collection Truck Noise 4
262 MANHATTAN Construction 8
263 MANHATTAN Consumer Complaint 358
264 MANHATTAN Cranes and Derricks 9
265 MANHATTAN Curb Condition 1
266 MANHATTAN DOF Parking - Tax Exemption 43
267 MANHATTAN DPR Internal 43
268 MANHATTAN Damaged Tree 54
269 MANHATTAN Dead Tree 24
270 MANHATTAN Derelict Bicycle 34
271 MANHATTAN Derelict Vehicle 29
272 MANHATTAN Derelict Vehicles 28
273 MANHATTAN Dirty Conditions 404
274 MANHATTAN Disorderly Youth 4
275 MANHATTAN Drinking 21
276 MANHATTAN Drinking Water 2
277 MANHATTAN EAP Inspection - F59 120
278 MANHATTAN ELECTRIC 333
279 MANHATTAN Electrical 28
280 MANHATTAN Elevator 70
281 MANHATTAN Fire Alarm - Modification 5
282 MANHATTAN Fire Alarm - Reinspection 7
283 MANHATTAN Fire Alarm - Replacement 2
284 MANHATTAN Fire Safety Director - F58 382
285 MANHATTAN Food Establishment 196
286 MANHATTAN Food Poisoning 56
287 MANHATTAN For Hire Vehicle Complaint 52
288 MANHATTAN Found Property 15
289 MANHATTAN GENERAL CONSTRUCTION 1333
290 MANHATTAN General Construction/Plumbing 157
291 MANHATTAN Graffiti 185
292 MANHATTAN HEATING 3494
293 MANHATTAN Harboring Bees/Wasps 2
294 MANHATTAN Hazardous Materials 58
295 MANHATTAN Highway Condition 4
296 MANHATTAN Homeless Encampment 170
297 MANHATTAN Homeless Person Assistance 54
298 MANHATTAN Illegal Animal Kept as Pet 2
299 MANHATTAN Illegal Parking 640
300 MANHATTAN Illegal Tree Damage 37
301 MANHATTAN Indoor Air Quality 122
302 MANHATTAN Indoor Sewage 16
303 MANHATTAN Industrial Waste 27
304 MANHATTAN Internal Code 2
305 MANHATTAN Investigations and Discipline (IAD) 9
306 MANHATTAN Lead 48
307 MANHATTAN Litter Basket / Request 68
308 MANHATTAN Maintenance or Facility 129
309 MANHATTAN Miscellaneous Categories 4
310 MANHATTAN Missed Collection (All Materials) 116
311 MANHATTAN Mold 6
312 MANHATTAN NONCONST 673
313 MANHATTAN Noise 1848
314 MANHATTAN Noise - Commercial 1140
315 MANHATTAN Noise - Helicopter 66
316 MANHATTAN Noise - House of Worship 16
317 MANHATTAN Noise - Park 91
318 MANHATTAN Noise - Street/Sidewalk 917
319 MANHATTAN Noise - Vehicle 255
320 MANHATTAN Non-Residential Heat 36
321 MANHATTAN Open Flame Permit 1
322 MANHATTAN Other Enforcement 91
323 MANHATTAN Overflowing Litter Baskets 32
324 MANHATTAN Overgrown Tree/Branches 36
325 MANHATTAN PAINT - PLASTER 971
326 MANHATTAN PLUMBING 961
327 MANHATTAN Panhandling 6
328 MANHATTAN Plant 3
329 MANHATTAN Plumbing 8
330 MANHATTAN Posting Advertisement 1
331 MANHATTAN Public Payphone Complaint 15
332 MANHATTAN Public Toilet 1
333 MANHATTAN Recycling Enforcement 15
334 MANHATTAN Rodent 372
335 MANHATTAN Root/Sewer/Sidewalk Condition 15
336 MANHATTAN SCRIE 172
337 MANHATTAN STRUCTURAL 3
338 MANHATTAN Sanitation Condition 169
339 MANHATTAN Scaffold Safety 4
340 MANHATTAN School Maintenance 24
341 MANHATTAN Senior Center Complaint 3
342 MANHATTAN Sewer 172
343 MANHATTAN Sidewalk Condition 118
344 MANHATTAN Smoking 53
345 MANHATTAN Special Enforcement 29
346 MANHATTAN Special Projects Inspection Team (SPIT) 38
347 MANHATTAN Standing Water 10
348 MANHATTAN Street Condition 834
349 MANHATTAN Street Light Condition 836
350 MANHATTAN Street Sign - Damaged 38
351 MANHATTAN Street Sign - Dangling 9
352 MANHATTAN Street Sign - Missing 28
353 MANHATTAN Sweeping/Missed-Inadequate 12
354 MANHATTAN Taxi Complaint 870
355 MANHATTAN Traffic 53
356 MANHATTAN Traffic Signal Condition 988
357 MANHATTAN Tunnel Condition 1
358 MANHATTAN Unleashed Dog 4
359 MANHATTAN Unsanitary Animal Pvt Property 12
360 MANHATTAN Unsanitary Pigeon Condition 11
361 MANHATTAN Urinating in Public 10
362 MANHATTAN Vacant Lot 8
363 MANHATTAN Vending 119
364 MANHATTAN Violation of Park Rules 46
365 MANHATTAN Water Conservation 21
366 MANHATTAN Water Quality 26
367 MANHATTAN Water System 961
368 MANHATTAN Window Guard 1
369 QUEENS APPLIANCE 67
370 QUEENS Air Quality 76
371 QUEENS Animal in a Park 16
372 QUEENS Asbestos 30
373 QUEENS BEST/Site Safety 5
374 QUEENS Beach/Pool/Sauna Complaint 1
375 QUEENS Bike/Roller/Skate Chronic 3
376 QUEENS Blocked Driveway 1890
377 QUEENS Boilers 17
378 QUEENS Bridge Condition 4
379 QUEENS Broken Muni Meter 361
380 QUEENS Building/Use 452
381 QUEENS Bus Stop Shelter Placement 5
382 QUEENS CONSTRUCTION 6
383 QUEENS City Vehicle Placard Complaint 12
384 QUEENS Collection Truck Noise 1
385 QUEENS Construction 406
386 QUEENS Consumer Complaint 284
387 QUEENS Cranes and Derricks 1
388 QUEENS Curb Condition 25
389 QUEENS DPR Internal 15
390 QUEENS DWD 1
391 QUEENS Damaged Tree 598
392 QUEENS Dead Tree 331
393 QUEENS Derelict Bicycle 11
394 QUEENS Derelict Vehicle 366
395 QUEENS Derelict Vehicles 317
396 QUEENS Dirty Conditions 991
397 QUEENS Disorderly Youth 3
398 QUEENS Drinking 25
399 QUEENS Drinking Water 4
400 QUEENS EAP Inspection - F59 3
401 QUEENS ELECTRIC 343
402 QUEENS Electrical 21
403 QUEENS Elevator 44
404 QUEENS Fire Alarm - New System 4
405 QUEENS Fire Alarm - Reinspection 6
406 QUEENS Fire Alarm - Replacement 1
407 QUEENS Fire Safety Director - F58 38
408 QUEENS Food Establishment 110
409 QUEENS Food Poisoning 31
410 QUEENS For Hire Vehicle Complaint 38
411 QUEENS GENERAL CONSTRUCTION 850
412 QUEENS General Construction/Plumbing 167
413 QUEENS Graffiti 133
414 QUEENS HEATING 1896
415 QUEENS Harboring Bees/Wasps 5
416 QUEENS Hazardous Materials 43
417 QUEENS Highway Condition 68
418 QUEENS Homeless Encampment 21
419 QUEENS Homeless Person Assistance 8
420 QUEENS Illegal Animal Kept as Pet 10
421 QUEENS Illegal Animal Sold 1
422 QUEENS Illegal Parking 887
423 QUEENS Illegal Tree Damage 50
424 QUEENS Indoor Air Quality 35
425 QUEENS Indoor Sewage 22
426 QUEENS Industrial Waste 25
427 QUEENS Internal Code 2
428 QUEENS Investigations and Discipline (IAD) 10
429 QUEENS Lead 39
430 QUEENS Litter Basket / Request 61
431 QUEENS Maintenance or Facility 90
432 QUEENS Missed Collection (All Materials) 319
433 QUEENS Mold 3
434 QUEENS Municipal Parking Facility 1
435 QUEENS NONCONST 575
436 QUEENS Noise 418
437 QUEENS Noise - Commercial 451
438 QUEENS Noise - Helicopter 9
439 QUEENS Noise - House of Worship 15
440 QUEENS Noise - Park 27
441 QUEENS Noise - Street/Sidewalk 226
442 QUEENS Noise - Vehicle 130
443 QUEENS Non-Residential Heat 17
444 QUEENS Other Enforcement 93
445 QUEENS Overflowing Litter Baskets 17
446 QUEENS Overgrown Tree/Branches 461
447 QUEENS PAINT - PLASTER 580
448 QUEENS PLUMBING 665
449 QUEENS Panhandling 16
450 QUEENS Plant 16
451 QUEENS Plumbing 22
452 QUEENS Posting Advertisement 1
453 QUEENS Public Payphone Complaint 2
454 QUEENS Public Toilet 3
455 QUEENS Recycling Enforcement 8
456 QUEENS Rodent 250
457 QUEENS Root/Sewer/Sidewalk Condition 360
458 QUEENS SCRIE 116
459 QUEENS STRUCTURAL 4
460 QUEENS Sanitation Condition 549
461 QUEENS Scaffold Safety 1
462 QUEENS School Maintenance 15
463 QUEENS Senior Center Complaint 2
464 QUEENS Sewer 656
465 QUEENS Sidewalk Condition 75
466 QUEENS Smoking 21
467 QUEENS Special Enforcement 79
468 QUEENS Special Projects Inspection Team (SPIT) 31
469 QUEENS Stalled Sites 1
470 QUEENS Standing Water 17
471 QUEENS Street Condition 797
472 QUEENS Street Light Condition 2122
473 QUEENS Street Sign - Damaged 73
474 QUEENS Street Sign - Dangling 32
475 QUEENS Street Sign - Missing 77
476 QUEENS Sweeping/Missed-Inadequate 54
477 QUEENS Tattooing 3
478 QUEENS Taxi Complaint 150
479 QUEENS Traffic 34
480 QUEENS Traffic Signal Condition 521
481 QUEENS Trans Fat 1
482 QUEENS Transportation Provider Complaint 1
483 QUEENS Unleashed Dog 13
484 QUEENS Unsanitary Animal Facility 2
485 QUEENS Unsanitary Animal Pvt Property 36
486 QUEENS Unsanitary Pigeon Condition 12
487 QUEENS Urinating in Public 6
488 QUEENS Vacant Lot 29
489 QUEENS Vending 49
490 QUEENS Violation of Park Rules 13
491 QUEENS Water Conservation 53
492 QUEENS Water Quality 12
493 QUEENS Water System 552
494 QUEENS X-Ray Machine/Equipment 1
495 STATEN ISLAND APPLIANCE 6
496 STATEN ISLAND Air Quality 15
497 STATEN ISLAND Animal Facility - No Permit 2
498 STATEN ISLAND Animal in a Park 13
499 STATEN ISLAND Asbestos 7
500 STATEN ISLAND BEST/Site Safety 1
501 STATEN ISLAND Blocked Driveway 111
502 STATEN ISLAND Boilers 2
503 STATEN ISLAND Broken Muni Meter 24
504 STATEN ISLAND Building/Use 61
505 STATEN ISLAND CONSTRUCTION 4
506 STATEN ISLAND City Vehicle Placard Complaint 1
507 STATEN ISLAND Consumer Complaint 36
508 STATEN ISLAND Curb Condition 2
509 STATEN ISLAND DPR Internal 6
510 STATEN ISLAND Damaged Tree 123
511 STATEN ISLAND Dead Tree 64
512 STATEN ISLAND Derelict Bicycle 1
513 STATEN ISLAND Derelict Vehicle 102
514 STATEN ISLAND Derelict Vehicles 19
515 STATEN ISLAND Dirty Conditions 100
516 STATEN ISLAND Disorderly Youth 6
517 STATEN ISLAND Drinking 2
518 STATEN ISLAND ELECTRIC 68
519 STATEN ISLAND Electrical 4
520 STATEN ISLAND Elevator 5
521 STATEN ISLAND Fire Alarm - Reinspection 1
522 STATEN ISLAND Fire Safety Director - F58 9
523 STATEN ISLAND Food Establishment 18
524 STATEN ISLAND Food Poisoning 6
525 STATEN ISLAND For Hire Vehicle Complaint 10
526 STATEN ISLAND GENERAL CONSTRUCTION 224
527 STATEN ISLAND General Construction/Plumbing 53
528 STATEN ISLAND Graffiti 17
529 STATEN ISLAND HEATING 169
530 STATEN ISLAND Harboring Bees/Wasps 2
531 STATEN ISLAND Hazardous Materials 5
532 STATEN ISLAND Highway Condition 8
533 STATEN ISLAND Homeless Encampment 2
534 STATEN ISLAND Illegal Animal Kept as Pet 2
535 STATEN ISLAND Illegal Animal Sold 1
536 STATEN ISLAND Illegal Parking 278
537 STATEN ISLAND Illegal Tree Damage 19
538 STATEN ISLAND Indoor Air Quality 9
539 STATEN ISLAND Indoor Sewage 4
540 STATEN ISLAND Industrial Waste 9
541 STATEN ISLAND Investigations and Discipline (IAD) 4
542 STATEN ISLAND Lead 4
543 STATEN ISLAND Litter Basket / Request 20
544 STATEN ISLAND Maintenance or Facility 23
545 STATEN ISLAND Missed Collection (All Materials) 178
546 STATEN ISLAND Mold 5
547 STATEN ISLAND NONCONST 116
548 STATEN ISLAND Noise 115
549 STATEN ISLAND Noise - Commercial 76
550 STATEN ISLAND Noise - House of Worship 2
551 STATEN ISLAND Noise - Park 4
552 STATEN ISLAND Noise - Street/Sidewalk 36
553 STATEN ISLAND Noise - Vehicle 25
554 STATEN ISLAND Non-Residential Heat 1
555 STATEN ISLAND Other Enforcement 27
556 STATEN ISLAND Overflowing Litter Baskets 11
557 STATEN ISLAND Overgrown Tree/Branches 134
558 STATEN ISLAND PAINT - PLASTER 124
559 STATEN ISLAND PLUMBING 139
560 STATEN ISLAND Plumbing 1
561 STATEN ISLAND Poison Ivy 3
562 STATEN ISLAND Public Payphone Complaint 3
563 STATEN ISLAND Recycling Enforcement 6
564 STATEN ISLAND Rodent 87
565 STATEN ISLAND Root/Sewer/Sidewalk Condition 144
566 STATEN ISLAND SCRIE 6
567 STATEN ISLAND STRUCTURAL 1
568 STATEN ISLAND Sanitation Condition 160
569 STATEN ISLAND School Maintenance 7
570 STATEN ISLAND Sewer 190
571 STATEN ISLAND Sidewalk Condition 16
572 STATEN ISLAND Smoking 1
573 STATEN ISLAND Snow 1
574 STATEN ISLAND Special Enforcement 11
575 STATEN ISLAND Special Natural Area District (SNAD) 3
576 STATEN ISLAND Special Projects Inspection Team (SPIT) 2
577 STATEN ISLAND Standing Water 17
578 STATEN ISLAND Street Condition 445
579 STATEN ISLAND Street Light Condition 449
580 STATEN ISLAND Street Sign - Damaged 43
581 STATEN ISLAND Street Sign - Dangling 17
582 STATEN ISLAND Street Sign - Missing 18
583 STATEN ISLAND Tattooing 1
584 STATEN ISLAND Taxi Complaint 1
585 STATEN ISLAND Traffic 14
586 STATEN ISLAND Traffic Signal Condition 101
587 STATEN ISLAND Unleashed Dog 6
588 STATEN ISLAND Unsanitary Animal Pvt Property 8
589 STATEN ISLAND Vacant Lot 25
590 STATEN ISLAND Vending 3
591 STATEN ISLAND Violation of Park Rules 6
592 STATEN ISLAND Water Conservation 10
593 STATEN ISLAND Water Quality 10
594 STATEN ISLAND Water System 326
595 Unspecified Agency Issues 174
596 Unspecified Benefit Card Replacement 387
597 Unspecified Complaint 9
598 Unspecified Compliment 9
599 Unspecified Consumer Complaint 37
600 Unspecified DCA / DOH New License Application Request 164
601 Unspecified DCA Literature Request 14
602 Unspecified DFTA Literature Request 2
603 Unspecified DHS Income Savings Requirement 1
604 Unspecified DOF Literature Request 5797
605 Unspecified DOT Literature Request 123
606 Unspecified DPR Literature Request 7
607 Unspecified Dirty Conditions 3
608 Unspecified Ferry Complaint 4
609 Unspecified Ferry Inquiry 32
610 Unspecified Ferry Permit 1
611 Unspecified For Hire Vehicle Complaint 3
612 Unspecified Found Property 1
613 Unspecified Highway Condition 3
614 Unspecified Invitation 3
615 Unspecified Legal Services Provider Complaint 2
616 Unspecified Maintenance or Facility 4
617 Unspecified Misc. Comments 2
618 Unspecified Noise 5
619 Unspecified Noise - Street/Sidewalk 1
620 Unspecified Noise - Vehicle 1
621 Unspecified OEM Literature Request 29
622 Unspecified Opinion for the Mayor 2
623 Unspecified Parking Card 8
624 Unspecified Request for Information 18
625 Unspecified SCRIE 9
626 Unspecified Sanitation Condition 3
627 Unspecified School Maintenance 8
628 Unspecified Senior Center Complaint 2
629 Unspecified Sewer 4
630 Unspecified Street Condition 6
631 Unspecified Street Light Condition 112
632 Unspecified Taxi Complaint 28
633 Unspecified Taxi Compliment 73
634 Unspecified Traffic Signal Condition 10
635 Unspecified Transportation Provider Complaint 1
636 Unspecified Vacant Lot 3
637 Unspecified Water System 2

Let's use a dictionary to keep track of the largest complaint type per borough.

In [95]:
max_per_borough = {}

for i, row in counts.iterrows():
    borough = row["Borough"]
    ctype = row["Complaint Type"]
    count = row["Count"]
    
    if borough not in max_per_borough:
        max_per_borough[borough] = (None, 0)
        
    max_ctype, max_count = max_per_borough[borough]
    
    if count > max_count:
        max_per_borough[borough] = (ctype, count)
In [96]:
max_per_borough
Out[96]:
{'BRONX': ('HEATING', 4093),
 'BROOKLYN': ('HEATING', 4548),
 'MANHATTAN': ('HEATING', 3494),
 'QUEENS': ('Street Light Condition', 2122),
 'STATEN ISLAND': ('Street Light Condition', 449),
 'Unspecified': ('DOF Literature Request', 5797)}

Or, we could use pandas! We can take the counts DataFrame and group it by borough:

In [100]:
counts.groupby('Borough').size()
Out[100]:
Borough
BRONX            118
BROOKLYN         125
MANHATTAN        126
QUEENS           126
STATEN ISLAND    100
Unspecified       43
dtype: int64

Now, we need to find the max value of Count in each borough:

In [104]:
max_counts = counts.groupby('Borough')["Count"].transform(max)
In [105]:
max_counts
Out[105]:
0      4093
1      4093
2      4093
3      4093
4      4093
5      4093
6      4093
7      4093
8      4093
9      4093
10     4093
11     4093
12     4093
13     4093
14     4093
15     4093
16     4093
17     4093
18     4093
19     4093
20     4093
21     4093
22     4093
23     4093
24     4093
25     4093
26     4093
27     4093
28     4093
29     4093
       ... 
608    5797
609    5797
610    5797
611    5797
612    5797
613    5797
614    5797
615    5797
616    5797
617    5797
618    5797
619    5797
620    5797
621    5797
622    5797
623    5797
624    5797
625    5797
626    5797
627    5797
628    5797
629    5797
630    5797
631    5797
632    5797
633    5797
634    5797
635    5797
636    5797
637    5797
Name: Count, dtype: int64

This actually replaces Count with the maximum value of Count for that borough. Now all we need to do is use fancy indexing:

In [106]:
counts[counts['Count'] == max_counts]
Out[106]:
Borough Complaint Type Count
39 BRONX HEATING 4093
163 BROOKLYN HEATING 4548
292 MANHATTAN HEATING 3494
472 QUEENS Street Light Condition 2122
579 STATEN ISLAND Street Light Condition 449
604 Unspecified DOF Literature Request 5797

Missing Data

In [132]:
complaints.info(memory_usage='deep')
<class 'pandas.core.frame.DataFrame'>
Int64Index: 111069 entries, 26589651 to 26428033
Data columns (total 51 columns):
Created Date                      111069 non-null object
Closed Date                       60270 non-null object
Agency                            111069 non-null object
Agency Name                       111069 non-null object
Complaint Type                    111069 non-null object
Descriptor                        110613 non-null object
Location Type                     79022 non-null object
Incident Zip                      98807 non-null object
Incident Address                  84441 non-null object
Street Name                       84432 non-null object
Cross Street 1                    84728 non-null object
Cross Street 2                    84005 non-null object
Intersection Street 1             19364 non-null object
Intersection Street 2             19366 non-null object
Address Type                      102247 non-null object
City                              98854 non-null object
Landmark                          95 non-null object
Facility Type                     19104 non-null object
Status                            111069 non-null object
Due Date                          39239 non-null object
Resolution Action Updated Date    96507 non-null object
Community Board                   111069 non-null object
Borough                           111069 non-null object
X Coordinate (State Plane)        98143 non-null float64
Y Coordinate (State Plane)        98143 non-null float64
Park Facility Name                111069 non-null object
Park Borough                      111069 non-null object
School Name                       111069 non-null object
School Number                     111048 non-null object
School Region                     110524 non-null object
School Code                       110524 non-null object
School Phone Number               111069 non-null object
School Address                    111069 non-null object
School City                       111069 non-null object
School State                      111069 non-null object
School Zip                        111069 non-null object
School Not Found                  38984 non-null object
School or Citywide Complaint      0 non-null float64
Vehicle Type                      99 non-null object
Taxi Company Borough              117 non-null object
Taxi Pick Up Location             1059 non-null object
Bridge Highway Name               185 non-null object
Bridge Highway Direction          185 non-null object
Road Ramp                         180 non-null object
Bridge Highway Segment            219 non-null object
Garage Lot Name                   49 non-null object
Ferry Direction                   24 non-null object
Ferry Terminal Name               70 non-null object
Latitude                          98143 non-null float64
Longitude                         98143 non-null float64
Location                          98143 non-null object
dtypes: float64(5), object(46)
memory usage: 278.1 MB
In [133]:
complaints["School Name"].value_counts()[:10]
Out[133]:
Unspecified                110434
Central Park                   33
Prospect Park                  13
J Hood Wright Park             13
Union Square Park              10
Hudson River Park              10
Riverside Park                  9
Macombs Dam Park                8
Sternberg Park                  8
School - PS 273 Wortman         7
Name: School Name, dtype: int64

We're storing the string "Unspecified" 110,434 times! What it really represents is missing data. The same is true of most of the "School" columns. We can see this either by quick inspection of the file, or by noticing that those fields have 111,069 "non-null" values. In some cases this makes sense (e.g., every complaint must have a complaint type associated with it), but in this case the "Unspecified" values should be null because they represent missing data.

In [134]:
complaints["School Name"] = complaints["School Name"].replace({"Unspecified": None})
In [135]:
complaints = complaints.replace({"Unspecified": None})
In [136]:
complaints["School Name"].value_counts()[:10]
Out[136]:
Central Park               33
Prospect Park              13
J Hood Wright Park         13
Hudson River Park          10
Union Square Park          10
Riverside Park              9
Macombs Dam Park            8
Sternberg Park              8
School - PS 273 Wortman     7
Mccarren Park               7
Name: School Name, dtype: int64
In [137]:
complaints.info(memory_usage='deep')
<class 'pandas.core.frame.DataFrame'>
Int64Index: 111069 entries, 26589651 to 26428033
Data columns (total 51 columns):
Created Date                      111069 non-null object
Closed Date                       60270 non-null object
Agency                            111069 non-null object
Agency Name                       111069 non-null object
Complaint Type                    111069 non-null object
Descriptor                        110613 non-null object
Location Type                     79022 non-null object
Incident Zip                      98807 non-null object
Incident Address                  84441 non-null object
Street Name                       84432 non-null object
Cross Street 1                    84728 non-null object
Cross Street 2                    84005 non-null object
Intersection Street 1             19364 non-null object
Intersection Street 2             19366 non-null object
Address Type                      102247 non-null object
City                              98854 non-null object
Landmark                          95 non-null object
Facility Type                     19104 non-null object
Status                            111069 non-null object
Due Date                          39239 non-null object
Resolution Action Updated Date    96507 non-null object
Community Board                   111069 non-null object
Borough                           103962 non-null object
X Coordinate (State Plane)        98143 non-null float64
Y Coordinate (State Plane)        98143 non-null float64
Park Facility Name                635 non-null object
Park Borough                      103962 non-null object
School Name                       635 non-null object
School Number                     614 non-null object
School Region                     90 non-null object
School Code                       90 non-null object
School Phone Number               635 non-null object
School Address                    635 non-null object
School City                       635 non-null object
School State                      635 non-null object
School Zip                        635 non-null object
School Not Found                  38984 non-null object
School or Citywide Complaint      0 non-null float64
Vehicle Type                      99 non-null object
Taxi Company Borough              117 non-null object
Taxi Pick Up Location             1059 non-null object
Bridge Highway Name               185 non-null object
Bridge Highway Direction          185 non-null object
Road Ramp                         180 non-null object
Bridge Highway Segment            219 non-null object
Garage Lot Name                   49 non-null object
Ferry Direction                   24 non-null object
Ferry Terminal Name               70 non-null object
Latitude                          98143 non-null float64
Longitude                         98143 non-null float64
Location                          98143 non-null object
dtypes: float64(5), object(46)
memory usage: 231.2 MB

Categorical Data

In [138]:
complaints["Borough"][:10]
Out[138]:
Unique Key
26589651       QUEENS
26593698       QUEENS
26594139    MANHATTAN
26595721    MANHATTAN
26590930    MANHATTAN
26592370       QUEENS
26595682       QUEENS
26595195       QUEENS
26590540    MANHATTAN
26594392     BROOKLYN
Name: Borough, dtype: object
In [139]:
complaints["Borough"].unique()
Out[139]:
array(['QUEENS', 'MANHATTAN', 'BROOKLYN', 'BRONX', 'STATEN ISLAND', None], dtype=object)
In [140]:
complaints["Borough"] = complaints["Borough"].astype('category')
In [141]:
complaints["Borough"][:10]
Out[141]:
Unique Key
26589651       QUEENS
26593698       QUEENS
26594139    MANHATTAN
26595721    MANHATTAN
26590930    MANHATTAN
26592370       QUEENS
26595682       QUEENS
26595195       QUEENS
26590540    MANHATTAN
26594392     BROOKLYN
Name: Borough, dtype: category
Categories (5, object): [BRONX, BROOKLYN, MANHATTAN, QUEENS, STATEN ISLAND]
In [143]:
category_cols = ["Agency", "Agency Name", "Complaint Type", "Descriptor",
                 "Location Type", "Address Type", "City", "Facility Type",
                 "Status", "Community Board", "Park Borough"]

for col in category_cols:
     complaints[col].astype('category')
In [144]:
complaints.info(verbose=False, memory_usage='deep')
<class 'pandas.core.frame.DataFrame'>
Int64Index: 111069 entries, 26589651 to 26428033
Columns: 51 entries, Created Date to Location
dtypes: category(1), float64(5), object(45)
memory usage: 224.7 MB

Grades dataset

grades.csv has (anonymized) grades from a college class. All the students in that class have graduated, and the names of the assignments have been changed to obscure what class this is from.

In [112]:
grades = pd.read_csv('grades.csv')
In [113]:
grades.describe()
Out[113]:
hw1 hw2 hw3 hw4 hw5 hw6 Midterm Final Total
count 72.000000 72.000000 72.000000 72.000000 72.000000 72.000000 72.000000 72.000000 72.000000
mean 89.314722 179.676944 170.680278 90.000000 85.111111 82.576389 74.009722 76.006944 82.227222
std 15.063493 28.775365 38.551670 9.457094 12.834324 25.453722 9.733422 10.222350 9.891911
min 45.000000 39.500000 38.000000 55.000000 50.000000 0.000000 44.500000 45.500000 50.950000
25% 89.500000 181.000000 160.240000 90.000000 80.750000 82.000000 67.657500 69.375000 80.320000
50% 95.000000 186.585000 189.450000 90.000000 87.000000 88.000000 74.815000 74.250000 84.090000
75% 100.000000 195.000000 195.427500 95.000000 95.500000 100.000000 81.817500 83.750000 88.405000
max 100.000000 200.000000 199.150000 100.000000 100.000000 100.000000 95.500000 96.000000 96.950000
In [114]:
grades.corr()
Out[114]:
hw1 hw2 hw3 hw4 hw5 hw6 Midterm Final Total
hw1 1.000000 0.487228 0.453700 0.409316 0.036674 0.070779 0.175797 0.056237 0.323693
hw2 0.487228 1.000000 0.710138 0.604343 0.121995 0.167828 0.402991 0.329742 0.579868
hw3 0.453700 0.710138 1.000000 0.613981 0.362415 0.566413 0.353350 0.315183 0.824843
hw4 0.409316 0.604343 0.613981 1.000000 0.161297 0.181675 0.177522 0.271350 0.483939
hw5 0.036674 0.121995 0.362415 0.161297 1.000000 0.414708 0.172973 0.147874 0.496866
hw6 0.070779 0.167828 0.566413 0.181675 0.414708 1.000000 0.168008 0.212730 0.800330
Midterm 0.175797 0.402991 0.353350 0.177522 0.172973 0.168008 1.000000 0.508609 0.526797
Final 0.056237 0.329742 0.315183 0.271350 0.147874 0.212730 0.508609 1.000000 0.584716
Total 0.323693 0.579868 0.824843 0.483939 0.496866 0.800330 0.526797 0.584716 1.000000
In [115]:
grades["hw2"].describe()
Out[115]:
count     72.000000
mean     179.676944
std       28.775365
min       39.500000
25%      181.000000
50%      186.585000
75%      195.000000
max      200.000000
Name: hw2, dtype: float64
In [116]:
grades["hw2"] = grades["hw2"] / 2
In [117]:
grades["hw2"].describe()
Out[117]:
count     72.000000
mean      89.838472
std       14.387683
min       19.750000
25%       90.500000
50%       93.292500
75%       97.500000
max      100.000000
Name: hw2, dtype: float64
In [118]:
grades["hw3"] = grades["hw3"] / 2
In [119]:
grades[["hw1","hw2","hw3","hw4","hw5","hw6"]].mean()
Out[119]:
hw1    89.314722
hw2    89.838472
hw3    85.340139
hw4    90.000000
hw5    85.111111
hw6    82.576389
dtype: float64
In [120]:
grades[["hw1","hw2","hw3","hw4","hw5","hw6"]].mean(axis=1)
Out[120]:
0     97.542500
1     94.430833
2     96.308333
3     94.430833
4     96.357500
5     96.357500
6     92.655833
7     93.184167
8     95.298333
9     94.789167
10    95.298333
11    98.942500
12    89.498333
13    94.789167
14    96.308333
15    84.232500
16    97.542500
17    94.059167
18    97.061667
19    96.609167
20    93.435000
21    91.570833
22    93.537500
23    88.260000
24    93.267500
25    84.232500
26    89.149167
27    90.345833
28    90.474167
29    93.435000
        ...    
42    92.398333
43    87.196667
44    92.655833
45    90.353333
46    86.800000
47    87.455833
48    89.149167
49    94.756667
50    90.353333
51    89.498333
52    87.455833
53    80.174167
54    86.800000
55    87.054167
56    87.054167
57    93.537500
58    78.988333
59    69.807500
60    80.174167
61    87.196667
62    69.807500
63    75.238333
64    72.497500
65    62.604167
66    72.497500
67    49.625000
68    61.416667
69    61.416667
70    62.604167
71    49.625000
dtype: float64
In [121]:
grades["hw avg"] = grades[["hw1","hw2","hw3","hw4","hw5","hw6"]].mean(axis=1)
In [122]:
grades.describe()
Out[122]:
hw1 hw2 hw3 hw4 hw5 hw6 Midterm Final Total hw avg
count 72.000000 72.000000 72.000000 72.000000 72.000000 72.000000 72.000000 72.000000 72.000000 72.000000
mean 89.314722 89.838472 85.340139 90.000000 85.111111 82.576389 74.009722 76.006944 82.227222 87.030139
std 15.063493 14.387683 19.275835 9.457094 12.834324 25.453722 9.733422 10.222350 9.891911 11.118237
min 45.000000 19.750000 19.000000 55.000000 50.000000 0.000000 44.500000 45.500000 50.950000 49.625000
25% 89.500000 90.500000 80.120000 90.000000 80.750000 82.000000 67.657500 69.375000 80.320000 86.158125
50% 95.000000 93.292500 94.725000 90.000000 87.000000 88.000000 74.815000 74.250000 84.090000 90.353333
75% 100.000000 97.500000 97.713750 95.000000 95.500000 100.000000 81.817500 83.750000 88.405000 94.152083
max 100.000000 100.000000 99.575000 100.000000 100.000000 100.000000 95.500000 96.000000 96.950000 98.942500
In [123]:
grades = grades[["hw1","hw2","hw3","hw4","hw5","hw6","hw avg","Midterm","Final","Total"]]
In [124]:
grades.describe()
Out[124]:
hw1 hw2 hw3 hw4 hw5 hw6 hw avg Midterm Final Total
count 72.000000 72.000000 72.000000 72.000000 72.000000 72.000000 72.000000 72.000000 72.000000 72.000000
mean 89.314722 89.838472 85.340139 90.000000 85.111111 82.576389 87.030139 74.009722 76.006944 82.227222
std 15.063493 14.387683 19.275835 9.457094 12.834324 25.453722 11.118237 9.733422 10.222350 9.891911
min 45.000000 19.750000 19.000000 55.000000 50.000000 0.000000 49.625000 44.500000 45.500000 50.950000
25% 89.500000 90.500000 80.120000 90.000000 80.750000 82.000000 86.158125 67.657500 69.375000 80.320000
50% 95.000000 93.292500 94.725000 90.000000 87.000000 88.000000 90.353333 74.815000 74.250000 84.090000
75% 100.000000 97.500000 97.713750 95.000000 95.500000 100.000000 94.152083 81.817500 83.750000 88.405000
max 100.000000 100.000000 99.575000 100.000000 100.000000 100.000000 98.942500 95.500000 96.000000 96.950000
In [125]:
grades.corr()
Out[125]:
hw1 hw2 hw3 hw4 hw5 hw6 hw avg Midterm Final Total
hw1 1.000000 0.487228 0.453700 0.409316 0.036674 0.070779 0.554078 0.175797 0.056237 0.323693
hw2 0.487228 1.000000 0.710138 0.604343 0.121995 0.167828 0.704075 0.402991 0.329742 0.579868
hw3 0.453700 0.710138 1.000000 0.613981 0.362415 0.566413 0.917449 0.353350 0.315183 0.824843
hw4 0.409316 0.604343 0.613981 1.000000 0.161297 0.181675 0.642298 0.177522 0.271350 0.483939
hw5 0.036674 0.121995 0.362415 0.161297 1.000000 0.414708 0.512807 0.172973 0.147874 0.496866
hw6 0.070779 0.167828 0.566413 0.181675 0.414708 1.000000 0.702948 0.168008 0.212730 0.800330
hw avg 0.554078 0.704075 0.917449 0.642298 0.512807 0.702948 1.000000 0.351264 0.322977 0.906070
Midterm 0.175797 0.402991 0.353350 0.177522 0.172973 0.168008 0.351264 1.000000 0.508609 0.526797
Final 0.056237 0.329742 0.315183 0.271350 0.147874 0.212730 0.322977 0.508609 1.000000 0.584716
Total 0.323693 0.579868 0.824843 0.483939 0.496866 0.800330 0.906070 0.526797 0.584716 1.000000

Healthcare dataset

Here's some ACA data from 2012:

In [111]:
counties = pd.read_csv('uninsured.csv', index_col='FIPS')
aca_enrollment = pd.read_csv('aca_enrollment.csv', index_col='State')
In [108]:
counties[:5]
Out[108]:
Geography State 2012 Population 2012 Estimated Number Uninsured
FIPS
1001 Autauga County Alabama 54264 5720
1003 Baldwin County Alabama 184805 25515
1005 Barbour County Alabama 24435 4223
1007 Bibb County Alabama 21438 3238
1009 Blount County Alabama 57155 7183
In [112]:
aca_enrollment[:5]
Out[112]:
State_Partisanship Marketplace_Type Federal_grants_awarded Medicaid_Expansion Pre-open_Enrollment_Monthly_Average_Medicaid_and_CHIP_Enrollment_July-Sept_2013 Total_Medicaid_and_CHIP_Enrollment_April_2014_Preliminary Number_of_Individuals_Who_Have_Selected_a_Marketplace_Plan QHP_target
State
Alabama Rep FFM 9772451 No 799176.0 765286 97870 77000
Alaska Rep FFM 0 No 120946.0 123212 12890 21000
Arizona Rep FFM 30877097 Yes 1201770.0 1345403 120071 131000
Arkansas Dem FFM 58149831 Yes 680920.0 829729 43446 53000
California Dem SBE 1065683056 Yes 9157000.0 10600000 1405102 686000

We want to answer the following question:

Each state uses a federal exchange (FFM) or a state-based exchange (SBE). What is the percentage of uninsured population in all the FFM states and in all the SBE states?

To answer this question, we need to merge two data sets.

First, we need to compute the total population and uninsured people in each state. We can do this with groupby and aggregate.

In [117]:
pop_totals = counties.groupby("State").aggregate(sum)
In [119]:
pop_totals[:5]
Out[119]:
2012 Population 2012 Estimated Number Uninsured
State
Alabama 4477466 628475
Alaska 555247 99167
Arizona 6361836 1090722
Arkansas 2407692 397994
California 37087767 6731712

We also want the percentage of uninsured:

In [120]:
pop_totals["Percent uninsured"] = 100 * pop_totals["2012 Estimated Number Uninsured"] / pop_totals["2012 Population"]
In [122]:
pop_totals[:5]
Out[122]:
2012 Population 2012 Estimated Number Uninsured Percent uninsured
State
Alabama 4477466 628475 14.036399
Alaska 555247 99167 17.859979
Arizona 6361836 1090722 17.144768
Arkansas 2407692 397994 16.530104
California 37087767 6731712 18.150761
In [126]:
aca_enrollment_with_pop = pd.merge(pop_totals, aca_enrollment, left_index=True, right_index=True)

Note: In this case, the join column happened to be the index in both dataframes. That is not always the case. If the "State" column wasn't an index (in either dataframe) we would've written something like:

pd.merge(pop_totals, aca_enrollment, on="State")
In [127]:
aca_enrollment_with_pop[:5]
Out[127]:
2012 Population 2012 Estimated Number Uninsured Percent uninsured State_Partisanship Marketplace_Type Federal_grants_awarded Medicaid_Expansion Pre-open_Enrollment_Monthly_Average_Medicaid_and_CHIP_Enrollment_July-Sept_2013 Total_Medicaid_and_CHIP_Enrollment_April_2014_Preliminary Number_of_Individuals_Who_Have_Selected_a_Marketplace_Plan QHP_target
State

What happened? Turns out the county data has a space before the name of each state!

In [134]:
counties.iloc[0]["State"]
Out[134]:
' Alabama'
In [137]:
counties["State"] = counties["State"].str.strip()
pop_totals = counties.groupby("State").aggregate(sum)
pop_totals["Percent uninsured"] = 100 * pop_totals["2012 Estimated Number Uninsured"] / pop_totals["2012 Population"]
In [138]:
aca_enrollment_with_pop = pd.merge(pop_totals, aca_enrollment, left_index=True, right_index=True)
In [139]:
aca_enrollment_with_pop[:5]
Out[139]:
2012 Population 2012 Estimated Number Uninsured Percent uninsured State_Partisanship Marketplace_Type Federal_grants_awarded Medicaid_Expansion Pre-open_Enrollment_Monthly_Average_Medicaid_and_CHIP_Enrollment_July-Sept_2013 Total_Medicaid_and_CHIP_Enrollment_April_2014_Preliminary Number_of_Individuals_Who_Have_Selected_a_Marketplace_Plan QHP_target
State
Alabama 4477466 628475 14.036399 Rep FFM 9772451 No 799176.0 765286 97870 77000
Alaska 555247 99167 17.859979 Rep FFM 0 No 120946.0 123212 12890 21000
Arizona 6361836 1090722 17.144768 Rep FFM 30877097 Yes 1201770.0 1345403 120071 131000
Arkansas 2407692 397994 16.530104 Dem FFM 58149831 Yes 680920.0 829729 43446 53000
California 37087767 6731712 18.150761 Dem SBE 1065683056 Yes 9157000.0 10600000 1405102 686000
In [144]:
by_exch_type = aca_enrollment_with_pop.groupby("Marketplace_Type").aggregate(sum)[["2012 Population", "2012 Estimated Number Uninsured"]]
In [145]:
by_exch_type
Out[145]:
2012 Population 2012 Estimated Number Uninsured
Marketplace_Type
FFM 192600064 30222152
SBE 101813370 14174778
In [146]:
by_exch_type["Percent uninsured"] = 100 * by_exch_type["2012 Estimated Number Uninsured"] / by_exch_type["2012 Population"]
In [148]:
by_exch_type
Out[148]:
2012 Population 2012 Estimated Number Uninsured Percent uninsured
Marketplace_Type
FFM 192600064 30222152 15.691662
SBE 101813370 14174778 13.922315

Similarly, we can do this by party:

In [149]:
by_party = aca_enrollment_with_pop.groupby("State_Partisanship").aggregate(sum)[["2012 Population", "2012 Estimated Number Uninsured"]]
by_party["Percent uninsured"] = 100 * by_party["2012 Estimated Number Uninsured"] / by_party["2012 Population"]
In [150]:
by_party
Out[150]:
2012 Population 2012 Estimated Number Uninsured Percent uninsured
State_Partisanship
Dem 130659500 17769259 13.599669
Rep 163753934 26627671 16.260782