Commit bce2c2a

Merge pull request #989 from oree-xx/second_post
Add second half blog post
2 parents 9f36d47 + 59415c5 commit bce2c2a

1 file changed, +105 -0 lines changed
  • content/blog/entries/2026-30-03-the-end-of-an-era

@@ -0,0 +1,105 @@
title: Quantifying the Commons: The end of an era
---
categories:
open-source
collaboration
community
quantifying-the-commons
---
author: Oreoluwa
---
pub_date: 2026-03-03
---
body:


## Quantifying the Commons: The end of an era

Dear gentle reader,

It is the end of an era, yet the beginning of my bloom as a young, aspiring
data professional on a global stage. It feels surreal to be at the end of this
amazing journey with my mentors and to see Quantifying the Commons become a
mature project in the Creative Commons open source community. Quantifying the
Commons is also blooming, so stay tuned to experience its impact on different
teams at Creative Commons.

Looking back, I was quite nervous at my first meeting with Timid Robot and
Sara. I did not quite understand the automation part of the project: how long
did the scripts run, and why? After further explanation by Timid Robot, I was
fascinated by the whole process of the system and really impressed by the
design thinking. A lot of detail and critical thinking went into implementing
the system. Big kudos to the project lead and previous contributors. I am in
love with the foundation that was in place prior to my contribution. It is a
firm one, and it made my work easier and worthwhile.


## Day 1 was amazing, Day 90 is growth!

I went from being confused by concepts used in the codebase to suggesting
ideas for improving the automation process in the system. I constantly read
articles, tested, iterated, and improvised functions and mechanisms. I improved
my data structure and algorithm skills, because I had to cater for test cases,
limitations, and risk. Risk in the sense that the system is exposed to change,
because the data from the API is live and dynamic. I described what I did in
the first half of my internship
[here](https://opensource.creativecommons.org/blog/entries/2026-01-22-My-outreachy-journey/).
In this blog post, I will focus on the second half of the internship. A big
part of the project is ensuring that the integrity of the data stays in sync
with the efficiency of the automation process.


### Automating the Smithsonian quarterly report

The Smithsonian is one of the largest public institutions in the United
States. It had a total of 38 units (data sources such as museums, zoos, and
libraries) at the time I worked on it. We derived insights on the usage of the
CC0 license across records with media and records without media. This prompted
me to add a horizontal stacked bar plot to the collection of visualizations in
the report system. From this, we could see the distribution of records with CC0
licenses at a glance. We also explored the distribution across the top 10 units
and the lowest 10 units. This tells us how commonly the CC0 license is used in
these institutions. After testing the whole workflow a couple of times, I
noticed that the unit codes seem to be updated frequently, with codes being
added or removed. I developed a function that keeps track of these changes and
gives a warning about them in the next automation run. This was the best way
available at the time to handle sudden unit code changes, so that our data
stays predictable and up to date.
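
The unit code check described above could be sketched roughly as follows. This
is a minimal illustration and not the project's actual function: the name
`detect_unit_code_changes` and the sample unit codes are hypothetical, and it
assumes the unit codes from the previous run are available for comparison.

```python
import warnings


def detect_unit_code_changes(previous_codes, current_codes):
    """Compare two collections of unit codes and warn about differences.

    Returns an (added, removed) tuple of sorted lists.
    """
    previous = set(previous_codes)
    current = set(current_codes)
    added = sorted(current - previous)
    removed = sorted(previous - current)
    if added or removed:
        # Warn instead of failing, so the automation run can continue
        warnings.warn(
            f"Unit codes changed since the last run: "
            f"added={added}, removed={removed}",
            UserWarning,
            stacklevel=2,
        )
    return added, removed


# Hypothetical unit codes, for illustration only
last_run = ["UNIT_A", "UNIT_B", "UNIT_C"]
this_run = ["UNIT_B", "UNIT_C", "UNIT_D"]
added, removed = detect_unit_code_changes(last_run, this_run)
print("added:", added, "removed:", removed)
```

Using a warning rather than an exception is one possible design choice here:
the report can still be generated, while the change is surfaced for review.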


### Automating the arXiv quarterly report

arXiv is a curated research-sharing platform with 5 million monthly active
users that hosts 2.6 million research papers. We derived quite interesting
insights from this data source. We then expanded the visualization collection
in plot.py by adding functions for a line plot and a vertical stacked bar
plot. The insights include the count of legal tools on a yearly basis and
various comparative analyses of the tools over the years. We also explored the
breakdown of the usage of these tools across different categories.
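
As a rough illustration of the yearly counts mentioned above, the sketch below
groups records by year and legal tool, then extracts one series that a line
plot or stacked bar plot could consume. The function names and the sample
records are hypothetical and are not taken from plot.py or from real arXiv
data.

```python
from collections import Counter, defaultdict


def count_tools_by_year(records):
    """Return {year: Counter({tool: count})} from (year, tool) pairs."""
    counts = defaultdict(Counter)
    for year, tool in records:
        counts[year][tool] += 1
    return dict(counts)


def yearly_series(counts, tool):
    """Return (years, values) for one tool, with years sorted ascending."""
    years = sorted(counts)
    # A Counter returns 0 for a tool that is absent in a given year
    return years, [counts[year][tool] for year in years]


# Hypothetical records, for illustration only (not real arXiv data)
records = [
    (2022, "CC BY 4.0"),
    (2022, "CC0 1.0"),
    (2023, "CC BY 4.0"),
    (2023, "CC BY 4.0"),
    (2023, "CC BY-SA 4.0"),
]
counts = count_tools_by_year(records)
years, values = yearly_series(counts, "CC BY 4.0")
print(years, values)
```

Keeping the counting separate from the plotting is only one way to structure
this; it makes the numbers easy to check before they are drawn.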


## Lessons learned

I learnt so much about creating a structure when solving a problem. It makes
debugging much easier, and it presents a detailed workflow for future
contributors to understand what has been done previously. It literally boils
down to how you name a variable or how you use it in a function. I also learnt
the importance of asking why. Timid Robot encouraged me to always question
assumptions and understand the reasoning behind decisions. This was the best
thing to do, because it made the whole internship fun and puzzling. Things
became naturally logical, and I could connect the dots quite easily.


## What Next!

I hope to continue volunteering my time on the project going forward. I am also
eager to explore other open-source projects involving research, big data, and
automation, and to further align these skill sets with my background in
actuarial science.


## Goodbye for now

I really enjoyed working with my mentors. I will miss our little chit chats
about the holidays, the weather, and even vacation trips. I look forward to
catching up again in the future.
