Commit 7d1e0c2

markdown source builds
Auto-generated via `{sandpaper}`
Source  : e8ab4fb
Branch  : main
Author  : Andrew Gait <andrew.gait@manchester.ac.uk>
Time    : 2026-04-13 10:26:37 +0000
Message : Merge pull request #58 from UoMResearchIT/39-typo-in-units-deg-is-not-a-callable

deg is not callable
1 parent 0df0fe1 commit 7d1e0c2

5 files changed: +64 -65 lines changed


02-dictionaries.md

Lines changed: 1 addition & 1 deletion
@@ -350,7 +350,7 @@ This package provides the method `json.load()` to read JSON data from a file and
 ```python
 import json

-with open('ro-crate-metadata-1.json') as f:
+with open('data/ro-crate-metadata-1.json') as f:
     data = json.load(f)
 ```

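The changed path assumes the lesson's files sit in a `data/` subdirectory of the working directory. A minimal sketch of the same `json.load()` pattern follows. It uses a temporary directory and a made-up file (`example-metadata.json`, with invented contents rather than the real RO-Crate metadata) so that it runs anywhere:

```python
import json
import tempfile
from pathlib import Path

# Build a throwaway 'data/' directory holding a small JSON file.
# File name and contents are illustrative only, not the lesson's metadata.
workdir = Path(tempfile.mkdtemp())
(workdir / "data").mkdir()
json_path = workdir / "data" / "example-metadata.json"
json_path.write_text(json.dumps({"name": "example", "version": 1}))

# Same pattern as the lesson: open the file, then parse it with json.load().
with open(json_path) as f:
    data = json.load(f)

print(type(data).__name__, data["name"])
```

If the file is opened with a relative path, as in the lesson, the path is resolved against the current working directory, which is why the missing `data/` prefix mattered.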
06-units_and_quantities.md

Lines changed: 1 addition & 1 deletion
@@ -295,7 +295,7 @@ u.rad.find_equivalent_units()
 We can see from this that the degree unit is `u.deg`, so we can use this to define our angles:

 ```python
-angle = 90 * u.deg()
+angle = 90 * u.deg
 print('angle in degrees: {}; and in radians: {}'.format(angle.value,angle.to(u.rad).value))
 ```

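As the commit message notes, `deg` is not callable: the unit is multiplied by a number rather than invoked. As a rough cross-check of the values the corrected line should report, here is a sketch that uses only the standard library `math` module instead of astropy (the variable names are illustrative):

```python
import math

angle_deg = 90.0
# math.radians converts degrees to radians (multiplies by pi / 180).
angle_rad = math.radians(angle_deg)

print('angle in degrees: {}; and in radians: {}'.format(angle_deg, angle_rad))
```

A right angle should come out as pi/2 radians, so a very different figure from the astropy version would suggest the unit was defined incorrectly.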
07-pandas_essential.md

Lines changed: 38 additions & 38 deletions
@@ -41,15 +41,15 @@ with a text editor and look at the data layout.
 The data within this file is organised much as you'd expect the data within a spreadsheet. The first row of the file contains the headers for each of the columns. The first column contains the name of the countries, while the remaining columns contain the GDP values for these countries for each year. Pandas has the `read_csv` function for reading structured data such as this, which makes reading the file easy:

 ```python
-data = pd.read_csv('data/gapminder_gdp_europe.csv',index_col='country')
+df = pd.read_csv('data/gapminder_gdp_europe.csv',index_col='country')
 ```

 Here we specify that the `country` column should be used as the index column (`index_col`).

 This creates a `DataFrame` object containing the dataset. This is similar to a numpy array, but has a number of significant differences. The first is that there are more ways to quickly understand a pandas dataframe. For example, the `info` function gives an overview of the data types and layout of the DataFrame:

 ```python
-data.info()
+df.info()
 ```

 ```output
@@ -74,10 +74,10 @@ dtypes: float64(12)
 memory usage: 3.0+ KB
 ```

-You can also carry out quick analysis of the data using the `describe` function:
+You can also carry out quick analysis of the DataFrame using the `describe` function:

 ```python
-data.describe()
+df.describe()
 ```

 ```output
@@ -94,58 +94,58 @@ max 14734.232750 17909.489730 20431.092700 ...

 ## Accessing elements, rows, and columns

-The other major difference to numpy arrays is that we cannot directly access the array elements using numerical indices such as `data[0,0]`. It is possible to access columns of data using the column headers as indices (for example, `data['gdpPercap_1952']`), but this is not recommended. Instead you should use the `iloc` and `loc` methods.
+The other major difference to numpy arrays is that we cannot directly access the array elements using numerical indices such as `df[0,0]`. It is possible to access columns of data using the column headers as indices (for example, `df['gdpPercap_1952']`), but this is not recommended. Instead you should use the `iloc` and `loc` methods.

 The `iloc` method enables us to access the DataFrame as we would a numpy array:

 ```python
-print(data.iloc[0,0])
+print(df.iloc[0,0])
 ```

 while the `loc` method enables the same access using the index and column headers:

 ```python
-print(data.loc["Albania", "gdpPercap_1952"])
+print(df.loc["Albania", "gdpPercap_1952"])
 ```

 For both of these methods, we can leave out the column indexes, and these will all be returned for the specified index row:

 ```python
-print(data.loc["Albania"])
+print(df.loc["Albania"])
 ```

-This will not work for column headings (in the inverse of the `data['gdpPercap_1952']` method) however. While it is quick to type, we recommend trying to avoid using this method of slicing the DataFrame, in favour of the methods described below.
+This will not work for column headings (in the inverse of the `df['gdpPercap_1952']` method) however. While it is quick to type, we recommend trying to avoid using this method of slicing the DataFrame, in favour of the methods described below.

 For both of these methods we can use the `:` character to select all elements in a row or column. For example, to get all information for Albania:

 ```python
-print(data.loc["Albania", :])
+print(df.loc["Albania", :])
 ```

 or:

 ```python
-print(data.iloc[0, :])
+print(df.iloc[0, :])
 ```

 The `:` character by itself is shorthand to indicate all elements across that indice, but it can also be combined with index values or column headers to specify a slice of the DataArray:

 ```python
-print(data.loc["Albania", 'gdpPercap_1962':'gdpPercap_1972'])
+print(df.loc["Albania", 'gdpPercap_1962':'gdpPercap_1972'])
 ```

 If either end of the slice definition is omitted, then the slice will run to the end of that indice (just as it does for `:` by itself):

 ```python
-print(data.loc["Albania", 'gdpPercap_1962':])
+print(df.loc["Albania", 'gdpPercap_1962':])
 ```

 Slices can also be defined using a list of indexes or column headings:

 ```python
 year_list = ['gdpPercap_1952','gdpPercap_1967','gdpPercap_1982','gdpPercap_1997']
 country_list = ['Albania','Belgium']
-print(data.loc[country_list, year_list])
+print(df.loc[country_list, year_list])
 ```

 ```output
@@ -157,10 +157,10 @@ Belgium 8343.105127 13149.041190 20979.845890 27561.196630

 ## Masking data

-Pandas data arrays are based on numpy arrays, and retain some of the numpy tools, such as masked arrays. This enables us to apply selection criteria to the datasets, so that only the values that we require are shown. For example, the following selects all data where the GDP is above $10,000:
+Pandas Dataframes are based on numpy arrays, and retain some of the numpy tools, such as masked arrays. This enables us to apply selection criteria to the Dataframes, so that only the values that we require are shown. For example, the following selects all data where the GDP is above $10,000:

 ```python
-subset = data.loc['Italy':'Poland', 'gdpPercap_1962':'gdpPercap_1972']
+subset = df.loc['Italy':'Poland', 'gdpPercap_1962':'gdpPercap_1972']
 print(subset[subset>10000])
 ```

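To see what the mask does on its own, here is a small sketch with a made-up DataFrame. The numbers and shortened column names are invented for illustration and are not taken from the gapminder file. Comparing a DataFrame with a scalar yields a boolean DataFrame of the same shape, and indexing with it keeps values where the mask is `True` and fills the rest with `NaN`:

```python
import pandas as pd

# Toy data: values are invented for illustration only.
toy = pd.DataFrame(
    {"1962": [8243.0, 12000.0], "1967": [10022.0, 13000.0]},
    index=["Italy", "Norway"],
)

mask = toy > 10000      # boolean DataFrame, same shape as toy
masked = toy[mask]      # values that fail the test become NaN

print(mask)
print(masked)

# dropna() is one way to discard rows that still contain a NaN after masking.
print(masked.dropna())
```

This is why the lesson's output shows `NaN` for countries such as Poland: the rows are kept, but values at or below the threshold are blanked out.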
@@ -179,7 +179,7 @@ Poland NaN NaN NaN
 Pandas is integrated with matplotlib, and so data can be plotted directly using the integrated `plot` method. For example, to plot the GDP for Sweden:

 ```python
-data.loc['Sweden',:].plot()
+df.loc['Sweden',:].plot()
 plt.xticks(rotation=90)
 ```

@@ -191,7 +191,7 @@ Note that, in the case above, we passed a single column of data to the `plot` me
 For example, we will transpose the GDP data for the first 3 countries in our dataset:

 ```python
-print(data.iloc[0:3,:].T)
+print(df.iloc[0:3,:].T)
 ```

 ```output
@@ -214,7 +214,7 @@ This data is now ready to be plotted as a histogram - first we set the style of

 ```python
 plt.style.use('ggplot')
-data.iloc[0:3,:].T.plot(kind='bar')
+df.iloc[0:3,:].T.plot(kind='bar')
 plt.xticks(rotation=90)
 plt.ylabel('GDP per capita')
 ```
@@ -237,7 +237,7 @@ axis (`axs`) objects. Pass the axis object to pandas when plotting your figure

 ```python
 fig, axs = plt.subplots()
-data.loc['Albania':'Belgium',:].T.plot(kind='bar',ax=axs)
+df.loc['Albania':'Belgium',:].T.plot(kind='bar',ax=axs)
 plt.xticks(rotation=90)
 plt.ylabel('GDP per capita')
 fig.savefig('albania-austria-belgium_GDP.png', bbox_inches='tight')
@@ -251,26 +251,26 @@ fig.savefig('albania-austria-belgium_GDP.png', bbox_inches='tight')

 Note that the x-tick labels have been taken directly from the index values of the transposed DataFrame (which were the original column labels). These don't really need to be more than the year of the GDP values, so we could change the column labels to reflect this.

-First we make a new copy of the dataframe (in case anything goes wrong):
+First we make a new copy of the DataFrame (in case anything goes wrong):

 ```python
-gdpPercap = data.copy(deep=True)
+df_gdpPercap = df.copy(deep=True)
 ```

-We have given this new dataframe a more appropriate name, replacing the information that will be removed from the column headers.
+We have given this new DataFrame a more appropriate name, replacing the information that will be removed from the column headers.

 Now we will use the inbuilt `str.strip` method to clean up our column labels for the new
-dataframe. Which of these commands is correct:
+DataFrame. Which of these commands is correct:

-1. `gdpPercap.columns = data.columns.str.strip('gdpPercap_')`
-2. `gdpPercap = data.columns.str.strip('gdpPercap_')`
+1. `df_gdpPercap.columns = df.columns.str.strip('gdpPercap_')`
+2. `df_gdpPercap = df.columns.str.strip('gdpPercap_')`

 ::::::::::::::: solution

 ## Solution

 The correct answer is 1. We have to pass the new column labels explicitly back to the
-array columns, otherwise all we do is replace the data array with a list of the new
+array columns, otherwise all we do is replace the DataFrame with a list of the new
 column labels.


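One caveat about `str.strip` is worth keeping in mind: its argument is treated as a set of characters to remove from both ends of each label, not as a literal prefix. It works for the year columns because digits are not in that set. The sketch below uses toy labels, including a hypothetical `gdpPercap_area` column that is not in the lesson's data, to show where the two approaches diverge:

```python
import pandas as pd

# Toy column labels; 'gdpPercap_area' is hypothetical and only shows the pitfall.
cols = pd.Index(["gdpPercap_1952", "gdpPercap_1957", "gdpPercap_area"])

# strip() removes any of the given characters from both ends of each label.
stripped = cols.str.strip("gdpPercap_")
print(list(stripped))

# Replacing the literal text leaves the rest of each label untouched.
# Note that this replaces every occurrence, not only a leading one.
replaced = cols.str.replace("gdpPercap_", "", regex=False)
print(list(replaced))
```

For the lesson's columns both calls give the same year labels, so answer 1 remains correct. The alternative is only a suggestion for data whose remaining label text could share characters with the prefix.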
@@ -287,7 +287,7 @@ Now that we've cleaned up the column labels, we now want to plot the GDP data fo
 Sweden and Iceland from 1972 onwards. The code block we will be using is:

 ```python
-gdp_percap<BLOCK>.T.plot(kind='line')
+df_gdpPercap<BLOCK>.T.plot(kind='line')

 # Create legend.
 plt.legend(loc='upper left')
@@ -297,18 +297,18 @@ plt.ylabel('GDP per capita ($)')

 Which of the following blocks of code should replace the `<BLOCK>` in the code above?

-1. `.loc['Sweden':'Iceland','gdpPercap_1972':]`
-2. `.loc['gdpPercap_1972':,['Sweden','Iceland']]`
-3. `.loc[['Sweden','Iceland'],'gdpPercap_1972':]`
-4. `.loc['gdpPercap_1972':,'Sweden':'Iceland']`
+1. `.loc['Sweden':'Iceland','1972':]`
+2. `.loc['1972':,['Sweden','Iceland']]`
+3. `.loc[['Sweden','Iceland'],'1972':]`
+4. `.loc['1972':,'Sweden':'Iceland']`

 ::::::::::::::: solution

 ## Solution

 The correct answer is 3. The two countries are not adjacent in the dataset, so we need
 to use a list to slice them, not a range (disqualifying answers 1 and 4). At the point
-where we select the countries using `.loc`, we have not yet transposed the dataset
+where we select the countries using `.loc`, we have not yet transposed the DataFrame
 (using `.T`), so the country names are still indexes, not column labels, and therefore
 need to be referenced first (ie in the first set of square brackets), (disqualifying
answers 2 and 4).
@@ -323,11 +323,11 @@ answers 2 and 4.

 :::::::::::::::::::::::::::::::::::::::: keypoints

-- CSV data is loaded using the `load_csv()` function
-- The `describe()` function gives a quick analysis of the data
-- `loc[<index>,<column>]` indexes the data array by the index and column labels
-- `iloc[<index>,<column>]` indexes the data array using numerical indicies
-- The data can be sliced by providing index and/or column indicies as ranges or lists of values
+- CSV data is loaded using the `read_csv()` function to create a pandas `DataFrame` object
+- The `describe()` function gives a quick analysis of the DataFrame
+- `loc[<index>,<column>]` indexes the DataFrame by the index and column labels
+- `iloc[<index>,<column>]` indexes the DataFrame using numerical indices
+- The data can be sliced by providing index and/or column indices as ranges or lists of values
 - The built-in `plot()` function can be used to plot the data using the `matplotlib` library

 ::::::::::::::::::::::::::::::::::::::::::::::::::

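The revised keypoints can be exercised end to end with a short sketch. It reads a small in-memory CSV rather than `data/gapminder_gdp_europe.csv`, and the three rows of values are made up for illustration:

```python
import io
import pandas as pd

# Made-up values in the same layout as the lesson's CSV file.
csv_text = """country,gdpPercap_1952,gdpPercap_1957,gdpPercap_1962
Albania,1601.0,1942.0,2312.0
Austria,6137.0,8842.0,10750.0
Belgium,8343.0,9714.0,10991.0
"""

# read_csv() builds the DataFrame; index_col makes 'country' the row index.
df = pd.read_csv(io.StringIO(csv_text), index_col="country")

print(df.describe())

# loc uses labels, iloc uses integer positions; both reach the same cell here.
print(df.loc["Albania", "gdpPercap_1952"])
print(df.iloc[0, 0])

# Label slices include the end label; positional slices exclude the end position.
label_slice = df.loc["Albania":"Austria", :]
position_slice = df.iloc[0:1, :]
print(len(label_slice), len(position_slice))
```

The last two lines highlight a difference that is easy to miss when switching between `loc` and `iloc`: the label-based range returns two rows while the positional range `0:1` returns one.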
md5sum.txt

Lines changed: 4 additions & 4 deletions
@@ -5,15 +5,15 @@
 "index.md" "b92970dbb54d63d6543a4c28e4df0a58" "site/built/index.md" "2025-04-15"
 "links.md" "8184cf4149eafbf03ce8da8ff0778c14" "site/built/links.md" "2025-04-15"
 "episodes/01-introduction.md" "c9d4152827292039d98bda76eac83684" "site/built/01-introduction.md" "2025-04-15"
-"episodes/02-dictionaries.md" "4d66dd7a84d0a6476fbd327812b7a0d4" "site/built/02-dictionaries.md" "2025-04-15"
+"episodes/02-dictionaries.md" "a3e25c74d90bef8f93c72bf320ba0a1b" "site/built/02-dictionaries.md" "2026-04-13"
 "episodes/03-numpy_essential.md" "8be0e5afb556d4f597f31f059488dd15" "site/built/03-numpy_essential.md" "2025-04-15"
 "episodes/04-software_package_management.md" "62db2ba2a4291e16d2bdd07b77bfc961" "site/built/04-software_package_management.md" "2025-04-15"
 "episodes/05-defensive_programming.md" "357bca7361e09066ad71393e21ad0c74" "site/built/05-defensive_programming.md" "2025-04-15"
-"episodes/06-units_and_quantities.md" "aa0215a27f41d8c99db495c727c56051" "site/built/06-units_and_quantities.md" "2025-11-17"
-"episodes/07-pandas_essential.md" "5dde09b1125f8c30156ae314b5fdcfe1" "site/built/07-pandas_essential.md" "2025-04-15"
+"episodes/06-units_and_quantities.md" "7e259f66d6f8889505c252a04082481f" "site/built/06-units_and_quantities.md" "2026-04-13"
+"episodes/07-pandas_essential.md" "443db701e191bb47de503d890d31d69f" "site/built/07-pandas_essential.md" "2026-04-13"
 "instructors/instructor-notes.md" "a59fd3b94c07c3fe3218c054a0f03277" "site/built/instructor-notes.md" "2025-04-15"
 "learners/discuss.md" "2758e2e5abd231d82d25c6453d8abbc6" "site/built/discuss.md" "2025-04-15"
 "learners/reference.md" "28853696c21dd22dfb883e9c241e95fb" "site/built/reference.md" "2025-04-15"
 "learners/setup.md" "b49173664b6f5d6f31f11fb8bf67f620" "site/built/setup.md" "2025-04-15"
 "profiles/learner-profiles.md" "60b93493cf1da06dfd63255d73854461" "site/built/learner-profiles.md" "2025-04-15"
-"renv/profiles/lesson-requirements/renv.lock" "db0fb2ee29e76e6cf403b8e7be6bbb50" "site/built/renv.lock" "2025-11-11"
+"renv/profiles/lesson-requirements/renv.lock" "27e4855ba3f7161ae21bda4f9d207141" "site/built/renv.lock" "2026-04-13"
