
Commit a59d334

DanielYang59, Qazalbash, and janosh authored

Change authors to list[dict['name' | 'url' | ..., str]] (#70)

* insert shebang and script check
* pre-commit run
* convert to dict of name and url
* convert more to dict
* convert publications
* remove comma
* Update data/packages.yml
  Co-authored-by: Meesum Qazalbash <[email protected]>
* fix __main__ loop and define class Author(TypedDict), TODO author validation
* make script path absolute
* revert reposition of module level vars
* reapply somehow disappeared abs dir change
* raise ValueError on non-https author URLs, fix yaml whitespace

---------

Co-authored-by: Meesum Qazalbash <[email protected]>
Co-authored-by: Janosh Riebesell <[email protected]>
1 parent 1293330 commit a59d334

8 files changed (+577 −234 lines)

data/applications.yml

+59 −9
@@ -1,47 +1,97 @@
 - title: Latent Space Policies for Hierarchical Reinforcement Learning
   url: https://arxiv.org/abs/1804.02808
   date: 2018-04-09
-  authors: Tuomas Haarnoja, Kristian Hartikainen, Pieter Abbeel, Sergey Levine
+  authors:
+    - name: Tuomas Haarnoja
+    - name: Kristian Hartikainen
+    - name: Pieter Abbeel
+    - name: Sergey Levine
   description: Uses normalizing flows, specifically RealNVPs, as policies for reinforcement learning and also applies them for the hierarchical reinforcement learning setting.

 - title: Analyzing Inverse Problems with Invertible Neural Networks
   url: https://arxiv.org/abs/1808.04730
   date: 2018-08-14
-  authors: Lynton Ardizzone, Jakob Kruse, Sebastian Wirkert, Daniel Rahner, Eric W. Pellegrini, Ralf S. Klessen, Lena Maier-Hein, Carsten Rother, Ullrich Köthe
+  authors:
+    - name: Lynton Ardizzone
+    - name: Jakob Kruse
+    - name: Sebastian Wirkert
+    - name: Daniel Rahner
+    - name: Eric W. Pellegrini
+    - name: Ralf S. Klessen
+    - name: Lena Maier-Hein
+    - name: Carsten Rother
+    - name: Ullrich Köthe
   description: Normalizing flows for inverse problems.

 - title: NeuTra-lizing Bad Geometry in Hamiltonian Monte Carlo Using Neural Transport
   url: https://arxiv.org/abs/1903.03704
   date: 2019-03-09
-  authors: Matthew Hoffman, Pavel Sountsov, Joshua V. Dillon, Ian Langmore, Dustin Tran, Srinivas Vasudevan
+  authors:
+    - name: Matthew Hoffman
+    - name: Pavel Sountsov
+    - name: Joshua V. Dillon
+    - name: Ian Langmore
+    - name: Dustin Tran
+    - name: Srinivas Vasudevan
   description: Uses normalizing flows in conjunction with Monte Carlo estimation to have more expressive distributions and better posterior estimation.

-- title: 'SRFlow: Learning the Super-Resolution Space with Normalizing Flow'
+- title: "SRFlow: Learning the Super-Resolution Space with Normalizing Flow"
   url: https://arxiv.org/abs/2006.14200
   date: 2020-06-25
-  authors: Andreas Lugmayr, Martin Danelljan, Luc Van Gool, Radu Timofte
+  authors:
+    - name: Andreas Lugmayr
+    - name: Martin Danelljan
+    - name: Luc Van Gool
+    - name: Radu Timofte
   description: Uses normalizing flows for super-resolution.

 - title: Faster Uncertainty Quantification for Inverse Problems with Conditional Normalizing Flows
   url: https://arxiv.org/abs/2007.07985
   date: 2020-07-15
-  authors: Ali Siahkoohi, Gabrio Rizzuti, Philipp A. Witte, Felix J. Herrmann
+  authors:
+    - name: Ali Siahkoohi
+    - name: Gabrio Rizzuti
+    - name: Philipp A. Witte
+    - name: Felix J. Herrmann
   description: Uses conditional normalizing flows for inverse problems. [[Video](https://youtu.be/nPvZIKaRBkI)]

 - title: Targeted free energy estimation via learned mappings
   url: https://aip.scitation.org/doi/10.1063/5.0018903
   date: 2020-10-13
-  authors: Peter Wirnsberger, Andrew J. Ballard, George Papamakarios, Stuart Abercrombie, Sébastien Racanière, Alexander Pritzel, Danilo Jimenez Rezende, Charles Blundell
+  authors:
+    - name: Peter Wirnsberger
+    - name: Andrew J. Ballard
+    - name: George Papamakarios
+    - name: Stuart Abercrombie
+    - name: Sébastien Racanière
+    - name: Alexander Pritzel
+    - name: Danilo Jimenez Rezende
+    - name: Charles Blundell
   description: Normalizing flows used to estimate free energy differences.

 - title: On the Sentence Embeddings from Pre-trained Language Models
   url: https://aclweb.org/anthology/2020.emnlp-main.733
   date: 2020-11-02
-  authors: Bohan Li, Hao Zhou, Junxian He, Mingxuan Wang, Yiming Yang, Lei Li
+  authors:
+    - name: Bohan Li
+    - name: Hao Zhou
+    - name: Junxian He
+    - name: Mingxuan Wang
+    - name: Yiming Yang
+    - name: Lei Li
   description: Proposes to use flows to transform anisotropic sentence embedding distributions from BERT to a smooth and isotropic Gaussian, learned through unsupervised objective. Demonstrates performance gains over SOTA sentence embeddings on semantic textual similarity tasks. Code available at <https://github.com/bohanli/BERT-flow>.

 - title: Normalizing Kalman Filters for Multivariate Time Series Analysis
   url: https://assets.amazon.science/ea/0c/88b7bdd54eae8c08983fa4cc3e06/normalizing-kalman-filters-for-multivariate-time-series-analysis.pdf
   date: 2020-12-06
-  authors: Emmanuel de Bézenac, Syama Sundar Rangapuram, Konstantinos Benidis, Michael Bohlke-Schneider, Richard Kurle, Lorenzo Stella, Hilaf Hasson, Patrick Gallinari, Tim Januschowski
+  authors:
+    - name: Emmanuel de Bézenac
+    - name: Syama Sundar Rangapuram
+    - name: Konstantinos Benidis
+    - name: Michael Bohlke-Schneider
+    - name: Richard Kurle
+    - name: Lorenzo Stella
+    - name: Hilaf Hasson
+    - name: Patrick Gallinari
+    - name: Tim Januschowski
   description: Augments state space models with normalizing flows and thereby mitigates imprecisions stemming from idealized assumptions. Aimed at forecasting real-world data and handling varying levels of missing data. (Also available at [Amazon Science](https://amazon.science/publications/normalizing-kalman-filters-for-multivariate-time-series-analysis).)
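After this change, each `authors` entry parses as a list of dicts with a required `name` and an optional `url`, rather than a comma-separated string. A minimal sketch of consuming the new shape (the `author_names` helper and the inline item are illustrative, not code from this repo), including the commit's new rule that author URLs must start with https://:

```python
def author_names(item: dict) -> list[str]:
    """Collect author names from one YAML entry, rejecting non-https URLs."""
    names: list[str] = []
    for auth in item["authors"]:
        # mirrors the commit: raise ValueError on non-https author URLs
        if (url := auth.get("url")) and not url.startswith("https://"):
            raise ValueError(f"Invalid author {url=}, must start with https://")
        names.append(auth["name"])
    return names


# one applications.yml entry as it would parse after this commit
item = {
    "title": "Latent Space Policies for Hierarchical Reinforcement Learning",
    "url": "https://arxiv.org/abs/1804.02808",
    "authors": [{"name": "Tuomas Haarnoja"}, {"name": "Kristian Hartikainen"}],
}
print(author_names(item))  # ['Tuomas Haarnoja', 'Kristian Hartikainen']
```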

data/make_readme.py

old mode 100644 → new mode 100755
+112 −89
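The mode flip from 100644 to 100755 pairs with the new shebang ("insert shebang and script check" in the commit message) so the script can be run directly. A quick local illustration, using a made-up demo filename rather than the repo's actual script:

```shell
# create a tiny script with the same shebang the commit adds
printf '#!/usr/bin/env python3\nprint("ok")\n' > make_readme_demo.py
chmod +x make_readme_demo.py   # the same 100644 -> 100755 mode change git records
./make_readme_demo.py          # shebang lets it run without a "python3" prefix
```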
@@ -1,25 +1,36 @@
+#!/usr/bin/env python3
+
 """Script to generate readme.md from data/*.yml files."""

 import datetime
+import os
 import re
-from os.path import dirname
 from typing import TypedDict

 import yaml

-ROOT = dirname(dirname(__file__))
+ROOT_DIR = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
+
+
+class Author(TypedDict):
+    """An author of a paper or application."""
+
+    name: str
+    url: str | None
+    affiliation: str | None
+    github: str | None
+    orcid: str | None


 class Item(TypedDict):
     """An item in a readme section like a paper or package."""

     title: str
-    authors: str
+    authors: list[Author]
     date: datetime.date
     lang: str
     url: str
     description: str
-    authors_url: str | None
     repo: str | None
     date_added: datetime.date | None

@@ -44,7 +55,7 @@ class Section(TypedDict):

 def load_items(key: str) -> list[Item]:
     """Load list[Item] from YAML file."""
-    with open(f"{ROOT}/data/{key}.yml", encoding="utf8") as file:
+    with open(f"{ROOT_DIR}/data/{key}.yml", encoding="utf8") as file:
         return yaml.safe_load(file.read())

@@ -53,10 +64,9 @@ def load_items(key: str) -> list[Item]:
     for key in titles  # markdown is set below
 }

-
 seen_titles: set[tuple[str, str]] = set()
 required_keys = {"title", "url", "date", "authors", "description"}
-optional_keys = {"authors_url", "lang", "repo", "docs", "date_added", "last_updated"}
+optional_keys = {"lang", "repo", "docs", "date_added", "last_updated"}
 valid_languages = {"PyTorch", "TensorFlow", "JAX", "Julia", "Other"}
 et_al_after = 2

@@ -72,7 +82,7 @@ def validate_item(itm: Item, section_title: str) -> None:
     else:
         seen_titles.add((title, section_title))

-    if section_title in ("packages", "repos") and itm["lang"] not in valid_languages:
+    if section_title in {"packages", "repos"} and itm["lang"] not in valid_languages:
         errors += [
             f"Invalid lang in {title}: {itm['lang']}, must be one of {valid_languages}"
         ]
@@ -101,87 +111,100 @@ def validate_item(itm: Item, section_title: str) -> None:
     raise ValueError("\n".join(errors))


-for key, section in sections.items():
-    # Keep lang_names inside sections loop to refill language subsections for each new
-    # section. Used by both repos and Packages. Is a list for order and mutability.
-    lang_names = ["PyTorch", "TensorFlow", "JAX", "Julia", "Other"]
-
-    # sort first by language with order determined by lang_names (only applies to
-    # Package and repos sections), then by date
-    section["items"].sort(key=lambda x: x["date"], reverse=True)
-    if key in ("packages", "repos"):
-        section["items"].sort(key=lambda itm: lang_names.index(itm["lang"]))
-
-    # add item count after section title
-    section["markdown"] += f" <small>({len(section['items'])})</small>\n\n"
-
-    for itm in section["items"]:
-        if (lang := itm.get("lang")) in lang_names:
-            lang_names.remove(lang)
-            # print language subsection title if this is the first item with that lang
-            section["markdown"] += (
-                f'<br>\n\n### <img src="assets/{lang.lower()}.svg" alt="{lang}" '
-                f'height="20px"> &nbsp;{lang} {key.title()}\n\n'
-            )
-
-        validate_item(itm, section["title"])
-
-        authors = itm["authors"]
-        date = itm["date"]
-        description = itm["description"]
-        title = itm["title"]
-        url = itm["url"]
-
-        author_list = authors.split(", ")
-        if key in ("publications", "applications"):
-            # only show people's last name for papers
-            author_list = [author.split(" ")[-1] for author in author_list]
-        authors = ", ".join(author_list[:et_al_after])
-        if len(author_list) > et_al_after:
-            authors += " et al."
-
-        if authors_url := itm.get("authors_url"):
-            authors = f"[{authors}]({authors_url})"
-
-        md_str = f"1. {date} - [{title}]({url}) by {authors}"
-
-        if key in ("packages", "repos") and url.startswith("https://github.com"):
-            gh_login, repo_name = url.split("/")[3:5]
-            md_str += (
-                f'\n&ensp;\n<img src="https://img.shields.io/github/stars/'
-                f'{gh_login}/{repo_name}" alt="GitHub repo stars" valign="middle" />'
-            )
-
-        md_str += "<br>\n " + description.removesuffix("\n")
-        if docs := itm.get("docs"):
-            md_str += f" [[Docs]({docs})]"
-        if repo := itm.get("repo"):
-            md_str += f" [[Code]({repo})]"
-
-        section["markdown"] += md_str + "\n\n"
-
-
-with open(f"{ROOT}/readme.md", "r+", encoding="utf8") as file:
-    readme = file.read()
-
-    for section in sections.values():
-        # look ahead without matching
-        section_start_pat = f"(?<={section['title']})"
-        # look behind without matching
-        next_section_pat = "(?=<br>\n\n## )"
-
-        # match everything up to next heading
-        readme = re.sub(
-            rf"{section_start_pat}[\s\S]+?\n\n{next_section_pat}",
-            section["markdown"],
-            readme,
-        )
-
-    file.seek(0)
-    file.write(readme)
-    file.truncate()
-
-section_counts = "\n".join(
-    f"- {key}: {len(sec['items'])}" for key, sec in sections.items()
-)
-print(f"finished writing {len(seen_titles)} items to readme:\n{section_counts}")  # noqa: T201
+if __name__ == "__main__":
+    for key, section in sections.items():
+        # Keep lang_names inside sections loop to refill language
+        # subsections for each new section. Used by both repos and Packages.
+        # Is a list for order and mutability.
+        lang_names = ["PyTorch", "TensorFlow", "JAX", "Julia", "Other"]
+
+        # sort first by language with order determined by lang_names (only applies to
+        # Package and repos sections), then by date
+        section["items"].sort(key=lambda x: x["date"], reverse=True)
+        if key in ("packages", "repos"):
+            section["items"].sort(key=lambda itm: lang_names.index(itm["lang"]))
+
+        # add item count after section title
+        section["markdown"] += f" <small>({len(section['items'])})</small>\n\n"
+
+        for itm in section["items"]:
+            if (lang := itm.get("lang")) in lang_names:
+                lang_names.remove(lang)
+                # print language subsection title if this is the first item
+                # with that language
+                section["markdown"] += (
+                    f'<br>\n\n### <img src="assets/{lang.lower()}.svg" alt="{lang}" '
+                    f'height="20px"> &nbsp;{lang} {key.title()}\n\n'
+                )
+
+            validate_item(itm, section["title"])
+
+            authors = itm["authors"]
+            date = itm["date"]
+            description = itm["description"]
+            title = itm["title"]
+            url = itm["url"]
+
+            if key in ("publications", "applications"):
+                # only show people's last name for papers
+                authors = [
+                    auth | {"name": auth["name"].split(" ")[-1]} for auth in authors
+                ]
+
+            def auth_str(auth: Author) -> str:
+                """Return a markdown string for an author."""
+                auth_str = auth["name"]
+                if url := auth.get("url"):
+                    if not url.startswith("https://"):
+                        raise ValueError(
+                            f"Invalid author {url=}, must start with https://"
+                        )
+                    auth_str = f"[{auth_str}]({url})"
+                return auth_str
+
+            authors_str = ", ".join(map(auth_str, authors[:et_al_after]))
+            if len(authors) > et_al_after:
+                authors_str += " et al."
+
+            md_str = f"1. {date} - [{title}]({url}) by {authors_str}"
+
+            if key in ("packages", "repos") and url.startswith("https://github.com"):
+                gh_login, repo_name = url.split("/")[3:5]
+                md_str += (
+                    f'\n&ensp;\n<img src="https://img.shields.io/github/stars/'
+                    f'{gh_login}/{repo_name}" alt="GitHub repo stars"'
+                    ' valign="middle" />'
+                )
+
+            md_str += "<br>\n " + description.removesuffix("\n")
+            if docs := itm.get("docs"):
+                md_str += f" [[Docs]({docs})]"
+            if repo := itm.get("repo"):
+                md_str += f" [[Code]({repo})]"
+
+            section["markdown"] += md_str + "\n\n"
+
+    with open(f"{ROOT_DIR}/readme.md", "r+", encoding="utf8") as file:
+        readme = file.read()
+
+        for section in sections.values():
+            # look ahead without matching
+            section_start_pat = f"(?<={section['title']})"
+            # look behind without matching
+            next_section_pat = "(?=<br>\n\n## )"
+
+            # match everything up to next heading
+            readme = re.sub(
+                rf"{section_start_pat}[\s\S]+?\n\n{next_section_pat}",
+                section["markdown"],
+                readme,
+            )
+
+        file.seek(0)
+        file.write(readme)
+        file.truncate()
+
+    section_counts = "\n".join(
+        f"- {key}: {len(sec['items'])}" for key, sec in sections.items()
+    )
+    print(f"finished writing {len(seen_titles)} items to readme:\n{section_counts}")  # noqa: T201
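The rewritten author rendering can be exercised in isolation. This sketch reproduces the commit's `auth_str` logic, last-name truncation for papers, and the `et_al_after = 2` cutoff; the sample author list is illustrative, not taken from the repo's data files:

```python
et_al_after = 2  # same cutoff the script uses before appending "et al."


def auth_str(auth: dict) -> str:
    """Markdown string for one author dict, linked if an https url is given."""
    out = auth["name"]
    if url := auth.get("url"):
        if not url.startswith("https://"):
            raise ValueError(f"Invalid author {url=}, must start with https://")
        out = f"[{out}]({url})"
    return out


authors = [
    {"name": "Matthew Hoffman"},
    {"name": "Pavel Sountsov"},
    {"name": "Joshua V. Dillon"},
]
# papers only show last names, as in the publications/applications branch
authors = [auth | {"name": auth["name"].split(" ")[-1]} for auth in authors]
authors_str = ", ".join(map(auth_str, authors[:et_al_after]))
if len(authors) > et_al_after:
    authors_str += " et al."
print(authors_str)  # Hoffman, Sountsov et al.
```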
