
The first "column" in the DataFrame is the index, which defaults to incrementing integers starting from 0.
Just as each column has a name, the index gives each row its "name".
We can assign a column to be the index of a DataFrame:

listings_df = listings_df.set_index('id')

listings_df

Why do we need to assign the result of set_index()?

Calling .set_index() does not change the original DataFrame.
Instead, it returns a new DataFrame with the index changed, which we then assign back to the original variable.
Most pandas methods return a new value rather than changing the original value.
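The behaviour described above can be checked on a small example. This is a minimal sketch: toy_df and its values are made up for illustration and are not part of the listings dataset.

```python
import pandas as pd

# Hypothetical toy data standing in for the listings dataset.
toy_df = pd.DataFrame({
    'id': ['a1', 'b2', 'c3'],
    'name': ['Loft', 'Cabin', 'Studio'],
    'price': [120, 85, 60],
})

# set_index() returns a new DataFrame; toy_df itself is left untouched.
indexed_df = toy_df.set_index('id')

# The original still has the default integer index and an 'id' column.
original_columns = list(toy_df.columns)
original_index = list(toy_df.index)

# In the returned DataFrame, 'id' has moved from the columns to the index.
new_columns = list(indexed_df.columns)
new_index = list(indexed_df.index)

print(original_columns, original_index)
print(new_columns, new_index)
```

Because toy_df is unchanged, the result is only kept if we assign it to a variable, which is why the lesson writes listings_df = listings_df.set_index('id').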

We can index and slice DataFrames by integer position using .iloc:

To get the first row:

listings_df.iloc[0]

To get the value in the second column of the first row:

listings_df.iloc[0, 1]

To get the second column of the first five rows:

listings_df.iloc[0:5, 1]

To get the second column of all rows:

listings_df.iloc[:, 1]
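To see what each .iloc form returns, here is a small sketch on made-up data. The toy_df name, its labels, and its values are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical toy data: the index holds row names, as after set_index('id').
toy_df = pd.DataFrame(
    {'name': ['Loft', 'Cabin', 'Studio'], 'price': [120, 85, 60]},
    index=['a1', 'b2', 'c3'],
)

# A single row comes back as a Series whose labels are the column names.
first_row = toy_df.iloc[0]

# Row position 0, column position 1: a single value (the first price).
one_value = toy_df.iloc[0, 1]

# Positions 0 and 1 only: as with list slicing, the end position is excluded.
first_two_prices = toy_df.iloc[0:2, 1]

# A bare : selects every row.
all_prices = toy_df.iloc[:, 1]

print(first_row)
print(one_value)
print(first_two_prices)
print(all_prices)
```

Note that .iloc counts positions from 0 and ignores the row names in the index entirely.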

We can also index and slice rows and columns by their names using .loc:

To get a single row by its name in the index:

listings_df.loc['l9995141']

To get several rows by their names:

listings_df.loc[['l9995141', 'l12026015', 'l44688136']]

While you can use : slicing to specify start and end names for a range (with .loc, both the start and end names are included), it is more common to specify a list of names.

To get the name column of all rows:

listings_df.loc[:, 'name']
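The .loc forms above can be tried on a small example. This is a sketch on made-up data: toy_df and the row names 'a1', 'b2', and 'c3' are hypothetical, not real listing ids.

```python
import pandas as pd

# Hypothetical toy data with row names in the index.
toy_df = pd.DataFrame(
    {'name': ['Loft', 'Cabin', 'Studio'], 'price': [120, 85, 60]},
    index=['a1', 'b2', 'c3'],
)

# A single row by name comes back as a Series.
one_row = toy_df.loc['b2']

# A list of names returns a DataFrame with rows in the order requested.
several_rows = toy_df.loc[['c3', 'a1']]

# A name slice includes BOTH ends, unlike position slicing with .iloc.
name_range = toy_df.loc['a1':'b2']

# One column, by name, for all rows.
all_names = toy_df.loc[:, 'name']

print(one_row)
print(several_rows)
print(name_range)
print(all_names)
```

The inclusive end of a name slice is the main difference to remember when switching between .iloc and .loc.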

1d. Selection of Individual Values

Use sorting and indexing on listings_df to find: