Commit 1ab99d3

Merge pull request #1270 from ELC/scipy-jp-2020

Add Scipy Japan 2020

2 parents 2e50c6f + b621932 commit 1ab99d3

File tree

20 files changed: +467 −0 lines changed

scipy-jp-2020/category.json

Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
+{
+  "title": "Scipy Japan 2020"
+}
Lines changed: 24 additions & 0 deletions
@@ -0,0 +1,24 @@
+{
+  "description": "Apache Arrow is a cross-language development platform for in-memory data. You can use Apache Arrow to process large data effectively in Python and other languages such as R. Apache Arrow is the future of data processing. Apache Arrow 1.0, the first major version, will be released soon. It's a good time to know Apache Arrow and start using it.\n\nApache Arrowはインメモリデータのための言語横断的な開発プラットフォームです。Apache Arrowを使って、PythonやRなどの他の言語で大容量データを効率的に処理することができます。Apache Arrowはデータ処理の未来です。最初のメジャーバージョンであるApache Arrow 1.0がまもなくリリースされます。この機会にApache Arrowを知って、使い始めてみてはいかがでしょうか。",
+  "duration": 1339,
+  "language": "eng",
+  "recorded": "2020-10-30",
+  "related_urls": [
+    {
+      "label": "Conference Website",
+      "url": "https://www.scipyjapan.scipy.org/"
+    }
+  ],
+  "speakers": [
+    "Kouhei Sutou"
+  ],
+  "tags": [],
+  "thumbnail_url": "https://i.ytimg.com/vi_webp/MxvYG-EjO3A/maxresdefault.webp",
+  "title": "Apache Arrow 1.0 - A Cross-Language Development Platform for In-Memory Data",
+  "videos": [
+    {
+      "type": "youtube",
+      "url": "https://www.youtube.com/watch?v=MxvYG-EjO3A"
+    }
+  ]
+}
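The abstract above introduces Apache Arrow without showing code, so here is a minimal sketch of what getting started looks like in Python, assuming the pyarrow package (with pandas installed for the read path); the table contents and file name are invented for illustration.

```python
# A minimal pyarrow sketch (not from the talk itself): build an Arrow table
# in memory and round-trip it through the Feather on-disk format.
import pyarrow as pa
import pyarrow.feather as feather

# Columnar, in-memory table -- the core Arrow data structure.
table = pa.table({
    "id": [1, 2, 3],
    "value": [0.1, 0.2, 0.3],
})

# Persist and reload; Arrow's columnar layout is also the storage layout.
feather.write_feather(table, "example.feather")
loaded = feather.read_feather("example.feather")  # returns a pandas DataFrame
print(loaded)
```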
Lines changed: 25 additions & 0 deletions
@@ -0,0 +1,25 @@
+{
+  "description": "Naoki Yoshii, Kenji Matogawa\n\nTo achieve efficient development of semiconductor processes/materials, a method for predicting optimal process/materials conditions using machine learning, open information (papers) and material databases was implemented using Python.\n\n半導体プロセス/材料の効率的な開発を実現するために、機械学習、オープン情報(論文)、マテリアルデータベースを利用した最適なプロセス/材料条件を予測する方法をPythonにより実施した。",
+  "duration": 1121,
+  "language": "eng",
+  "recorded": "2020-10-30",
+  "related_urls": [
+    {
+      "label": "Conference Website",
+      "url": "https://www.scipyjapan.scipy.org/"
+    }
+  ],
+  "speakers": [
+    "Naoki Yoshii",
+    "Kenji Matogawa"
+  ],
+  "tags": [],
+  "thumbnail_url": "https://i.ytimg.com/vi_webp/3JKjrWGcGpg/maxresdefault.webp",
+  "title": "Applying Machine Learning to R&D for Semiconductor Process Development",
+  "videos": [
+    {
+      "type": "youtube",
+      "url": "https://www.youtube.com/watch?v=3JKjrWGcGpg"
+    }
+  ]
+}
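The abstract describes predicting optimal process/material conditions with machine learning but names no tooling. The sketch below shows the general shape of such a workflow with scikit-learn; every feature name, value range, and the synthetic yield function are invented stand-ins, not taken from the talk.

```python
# Hypothetical sketch: fit a regression model on tabulated process
# conditions, then screen a candidate pool for the predicted optimum.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
# Invented columns: temperature (C), pressure (Pa), precursor flow (sccm).
X = rng.uniform([300, 10, 5], [800, 100, 50], size=(200, 3))
y = -((X[:, 0] - 550) ** 2) / 1e4 + rng.normal(0, 0.1, 200)  # synthetic yield

model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)

# Pick the candidate with the best predicted yield -- a stand-in for
# "predicting optimal process/material conditions".
candidates = rng.uniform([300, 10, 5], [800, 100, 50], size=(10_000, 3))
best = candidates[np.argmax(model.predict(candidates))]
print("predicted-optimal conditions:", best)
```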
Lines changed: 25 additions & 0 deletions
@@ -0,0 +1,25 @@
+{
+  "description": "Mercari provides an image search feature, which makes it possible for users to find similar items by image. This talk describes how we implemented accurate similar-image search across 100s of millions of images. We will also highlight the techniques we used to keep the system efficient and up to date.\n\nメルカリには画像検索機能があり、ユーザーは画像から類似品を探すことができます。本講演では、数百万枚の画像の中から類似画像検索をどのようにして正確に実装したかを説明します。また、システムを効率的に更新するために使ったテクニックも紹介します。",
+  "duration": 823,
+  "language": "eng",
+  "recorded": "2020-10-30",
+  "related_urls": [
+    {
+      "label": "Conference Website",
+      "url": "https://www.scipyjapan.scipy.org/"
+    }
+  ],
+  "speakers": [
+    "Wakana Nogami",
+    "Sandeep Subramanian"
+  ],
+  "tags": [],
+  "thumbnail_url": "https://i.ytimg.com/vi_webp/QDkc08voRyo/maxresdefault.webp",
+  "title": "Approx Vector Search @Scale with App Image Search",
+  "videos": [
+    {
+      "type": "youtube",
+      "url": "https://www.youtube.com/watch?v=QDkc08voRyo"
+    }
+  ]
+}
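The abstract describes approximate vector search at scale without naming a library. One common way to realize it is an inverted-file (IVF) index such as the one in Faiss; the sketch below is a generic illustration under that assumption, with random vectors, and is not Mercari's actual system.

```python
# Hedged sketch of approximate nearest-neighbour search with Faiss.
import numpy as np
import faiss

d = 128                                             # embedding dimension
xb = np.random.rand(100_000, d).astype("float32")   # indexed image vectors
xq = np.random.rand(5, d).astype("float32")         # query image vectors

# IVF index: cluster the space into 1024 cells, then search only a few
# of them per query instead of scanning every vector.
quantizer = faiss.IndexFlatL2(d)
index = faiss.IndexIVFFlat(quantizer, d, 1024)
index.train(xb)              # learn the coarse clustering
index.add(xb)
index.nprobe = 16            # more probes = more accurate, slower

distances, ids = index.search(xq, 5)  # top-5 approximate neighbours
print(ids)
```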
Lines changed: 24 additions & 0 deletions
@@ -0,0 +1,24 @@
+{
+  "description": "Using the example of a chatbot I will show how a complex ML system can be built in an iterative fashion with feasibility and scalability in mind, starting from simple but proven methods and incrementally constructing an increasingly powerful pipeline of algorithms. I will discuss a range of natural language processing tools and techniques in Python, from scikit-learn's inbuilt text processing capabilities, to the advanced NLP library spaCy, all the way to neural networks and deep learning using PyTorch. Besides the technical aspects, I will present real case studies of how we at Bespoke have applied this approach.\n\nチャットボットの例を用いて、複雑なMLシステムがどのようにして実現可能性とスケーラビリティを念頭に置いた反復的な方法で構築されるかを紹介します。私は、scikit-learnの組み込みテキスト処理機能から、高度なNLPライブラリspaCy、PyTorchを使ったニューラルネットワークと深層学習まで、Pythonを使った自然言語処理ツールとテクニックの範囲について議論します。技術的な側面に加えて、私たちがBESPOKEでどのようにこのアプローチを適用したかの実際のケーススタディを紹介します。",
+  "duration": 1228,
+  "language": "eng",
+  "recorded": "2020-10-30",
+  "related_urls": [
+    {
+      "label": "Conference Website",
+      "url": "https://www.scipyjapan.scipy.org/"
+    }
+  ],
+  "speakers": [
+    "Max Frenzel"
+  ],
+  "tags": [],
+  "thumbnail_url": "https://i.ytimg.com/vi_webp/KtceW_s5kvs/maxresdefault.webp",
+  "title": "Building a Scalable AI Chatbot From Regex to Deep Learning",
+  "videos": [
+    {
+      "type": "youtube",
+      "url": "https://www.youtube.com/watch?v=KtceW_s5kvs"
+    }
+  ]
+}
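The talk starts from "simple but proven methods" before moving to spaCy and PyTorch. One hedged example of such a starting point is a TF-IDF intent classifier in scikit-learn; the utterances and intent labels below are invented, not Bespoke's data.

```python
# A minimal intent classifier of the "simple but proven" kind the talk
# begins with (illustrative only).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy labelled utterances -- invented for illustration.
texts = ["what time do you open", "where is the station",
         "how much is a ticket", "when do you close"]
intents = ["hours", "directions", "pricing", "hours"]

# Character n-grams are robust to typos in short chat messages.
clf = make_pipeline(
    TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4)),
    LogisticRegression(),
)
clf.fit(texts, intents)
print(clf.predict(["opening hours?"]))  # -> ['hours']
```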
Lines changed: 24 additions & 0 deletions
@@ -0,0 +1,24 @@
+{
+  "description": "Kubeflow is a tool for managing machine learning workflows on Kubernetes. In this talk, I will explain how we can use Kubeflow to take on the challenges of machine learning beyond model training, including data preprocessing, model evaluation, and deployment--the steps that together make up a full machine learning pipeline. I will introduce Kubeflow and explain the central concepts of the platform using basic examples. Then I will show how my team is using Kubeflow to tackle real-world problems in the real estate domain.\n\nKubeflowは、Kubernetes上で機械学習のワークフローを管理するためのツールです。 本講演では、Kubeflowを使って、データの前処理、モデルの評価、デプロイメントなど、モデル学習の枠を超えた機械学習への挑戦、つまり、機械学習のパイプラインを構成するステップをまとめて解説します。Kubeflowを紹介し、基本的な例を用いてプラットフォームの中心的な概念を説明します。その後、私のチームがどのようにKubeflowを使用して不動産分野の実世界の問題に取り組んでいるかを紹介します。",
+  "duration": 1248,
+  "language": "eng",
+  "recorded": "2020-10-30",
+  "related_urls": [
+    {
+      "label": "Conference Website",
+      "url": "https://www.scipyjapan.scipy.org/"
+    }
+  ],
+  "speakers": [
+    "William Horton"
+  ],
+  "tags": [],
+  "thumbnail_url": "https://i.ytimg.com/vi_webp/JwBNYDqtQ9I/maxresdefault.webp",
+  "title": "Building Machine Learning Pipelines with Kubeflow",
+  "videos": [
+    {
+      "type": "youtube",
+      "url": "https://www.youtube.com/watch?v=JwBNYDqtQ9I"
+    }
+  ]
+}
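As a concrete anchor for the pipeline concept described above, here is a hedged sketch of a two-step pipeline in the KFP v1-era Python SDK, which matches the talk's 2020 timeframe; the step bodies and storage paths are placeholders, not the speaker's real-estate pipeline.

```python
# Hedged Kubeflow Pipelines sketch (KFP v1 SDK): two Python functions
# become pipeline components, wired together by their outputs.
import kfp
from kfp import dsl
from kfp.components import create_component_from_func

def preprocess() -> str:
    return "s3://bucket/clean-data"      # placeholder output path

def train(data_path: str) -> str:
    return "s3://bucket/model"           # placeholder output path

preprocess_op = create_component_from_func(preprocess)
train_op = create_component_from_func(train)

@dsl.pipeline(name="ml-pipeline", description="preprocess -> train")
def pipeline():
    data = preprocess_op()
    train_op(data.output)                # dependency via output wiring

# Compile to a workflow spec that Kubeflow can run on Kubernetes.
kfp.compiler.Compiler().compile(pipeline, "pipeline.yaml")
```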
Lines changed: 24 additions & 0 deletions
@@ -0,0 +1,24 @@
+{
+  "description": "CuPy is an open-source library with a NumPy-compatible API that brings high-performance N-dimensional array computation by utilizing Nvidia GPUs. In most cases, users can get a several-times speed improvement from a drop-in replacement of their NumPy-based code. CuPy is actively being developed and is continuously well-maintained, resulting in 4,200+ GitHub stars and 17,000+ commits. CuPy was also presented at PyCon 2018.\n\nCuPyは、NumPyと互換性のあるAPIを持つオープンソースのライブラリで、Nvidia GPUを利用した高性能なN次元配列計算を実現します。ほとんどの場合、NumPyベースのコードをドロップインで置き換えることで、数倍の速度向上を得ることができます。CuPyは積極的に開発されており、継続的によくメンテナンスされており、4,200以上のGitHubスターと17,000以上のコミットを得ています。CuPyはPyCon 2018でも発表されました。",
+  "duration": 1164,
+  "language": "eng",
+  "recorded": "2020-10-30",
+  "related_urls": [
+    {
+      "label": "Conference Website",
+      "url": "https://www.scipyjapan.scipy.org/"
+    }
+  ],
+  "speakers": [
+    "Masayuki Takagi"
+  ],
+  "tags": [],
+  "thumbnail_url": "https://i.ytimg.com/vi_webp/xG1nz3R8H7I/maxresdefault.webp",
+  "title": "CuPy: A Numpy-Compatible Library for HPC with GPU",
+  "videos": [
+    {
+      "type": "youtube",
+      "url": "https://www.youtube.com/watch?v=xG1nz3R8H7I"
+    }
+  ]
+}
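The phrase "drop-in replacement" is concrete enough to show. In the minimal sketch below, the GPU computation differs from its NumPy equivalent only in which module is called; array sizes are arbitrary, and a CUDA-capable GPU with the cupy package is assumed.

```python
# Minimal CuPy sketch: same array API as NumPy, executed on the GPU.
import numpy as np
import cupy as cp

x_cpu = np.random.rand(1000, 1000)
x_gpu = cp.asarray(x_cpu)            # copy the host array to the GPU

# Identical to np.linalg.norm(x_cpu @ x_cpu.T, axis=1), but on the GPU.
y_gpu = cp.linalg.norm(x_gpu @ x_gpu.T, axis=1)

y_cpu = cp.asnumpy(y_gpu)            # copy the result back to the host
print(y_cpu[:3])
```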
Lines changed: 24 additions & 0 deletions
@@ -0,0 +1,24 @@
+{
+  "description": "This talk introduces dask-image, a python library for distributed image processing. Targeted towards applications involving large array data too big to fit in memory, dask-image is built on top of numpy, scipy, and dask allowing easy scalability and portability from your laptop to the supercomputing cluster. It is of broad interest to a diverse range of scientific fields including astronomy, geosciences, microscopy, and climate sciences. We will provide a general overview of the dask-image library, then discuss mixing and matching with your own custom functions, and present a practical case study of a python image processing pipeline.\n\nこの講演では、分散画像処理のための python ライブラリである dask-image を紹介します。メモリに収まりきらないほど大きな配列データを扱うアプリケーションをターゲットにしており、dask-image は numpy, scipy, dask の上に構築されているため、ラップトップからスーパーコンピュータクラスタへの拡張性と移植性を容易にします。天文学、地球科学、顕微鏡、気候科学などの多様な科学分野に広く関心があります。dask-image ライブラリの一般的な概要を説明した後、独自のカスタム関数との混合やマッチングについて議論し、Python 画像処理パイプラインの実践的なケーススタディを紹介します。",
+  "duration": 1208,
+  "language": "eng",
+  "recorded": "2020-10-30",
+  "related_urls": [
+    {
+      "label": "Conference Website",
+      "url": "https://www.scipyjapan.scipy.org/"
+    }
+  ],
+  "speakers": [
+    "Genevieve Buckley"
+  ],
+  "tags": [],
+  "thumbnail_url": "https://i.ytimg.com/vi_webp/dP0m2iZX0PU/maxresdefault.webp",
+  "title": "Dask Image - Distributed Image Processing for Large Data",
+  "videos": [
+    {
+      "type": "youtube",
+      "url": "https://www.youtube.com/watch?v=dP0m2iZX0PU"
+    }
+  ]
+}
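To make the larger-than-memory claim tangible, here is a minimal dask-image sketch: a stack of images is loaded lazily into one chunked dask array and filtered with a scipy.ndimage-style function. The file pattern and filter parameters are invented placeholders.

```python
# Hedged dask-image sketch: lazy loading plus out-of-core filtering.
import dask_image.imread
import dask_image.ndfilters

# One dask array backed by many image files, loaded chunk by chunk.
stack = dask_image.imread.imread("data/frames_*.tif")

# scipy.ndimage-style filtering, applied lazily across chunks.
smoothed = dask_image.ndfilters.gaussian_filter(stack, sigma=2)

# Nothing is read or computed until .compute() is called, so the same
# script scales from a laptop to a cluster via a dask scheduler.
result = smoothed.mean(axis=0).compute()
print(result.shape)
```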
Lines changed: 24 additions & 0 deletions
@@ -0,0 +1,24 @@
+{
+  "description": "Deep learning models are powered by data, the abundance of it. Most of the deep learning models that empower many critical applications in our day-to-day lives depend on labeled data. Affording humongous amounts of labeled datasets is not always possible for a number of different factors like budget, human resources, lack of expert annotators, and so on. The internet is practically an infinite source of unlabeled data. So, at this point, an important question that gets raised is - how can we effectively utilize this large pool of unlabeled data for developing better deep learning models? Self-supervised learning can be helpful here to answer questions like this.\n\nディープラーニングモデルは、豊富なデータを動力源としています。私たちの日常生活における多くの重要なアプリケーションに力を与えているディープラーニングモデルのほとんどは、ラベル付きデータに依存しています。膨大な量のラベル付きデータセットを提供することは、予算、人的資源、専門家のアノテータの不足など、さまざまな要因により常に可能とは限りません。インターネットには、事実上、ラベル付けされていないデータが無限に存在します。そこで、この時点で重要な疑問が浮かび上がってきます-どのようにして、より良いディープラーニングモデルを開発するために、この大規模な非ラベルデータのプールを効果的に利用することができるのでしょうか?自己教師付き学習は、このような疑問に答えるのに役立ちます。",
+  "duration": 1238,
+  "language": "eng",
+  "recorded": "2020-10-30",
+  "related_urls": [
+    {
+      "label": "Conference Website",
+      "url": "https://www.scipyjapan.scipy.org/"
+    }
+  ],
+  "speakers": [
+    "Sayak Paul"
+  ],
+  "tags": [],
+  "thumbnail_url": "https://i.ytimg.com/vi_webp/3686AJmoDTA/maxresdefault.webp",
+  "title": "Demystifying Self-Supervised Learning for Visual Recognition",
+  "videos": [
+    {
+      "type": "youtube",
+      "url": "https://www.youtube.com/watch?v=3686AJmoDTA"
+    }
+  ]
+}
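The abstract asks how unlabeled data can be used at all; one classic concrete answer is a rotation-prediction pretext task (Gidaris et al., 2018), sketched below in PyTorch. This illustrates the genre the talk surveys, not the speaker's code, and the tiny encoder is a stand-in for a real backbone such as a ResNet.

```python
# Hedged self-supervised sketch: rotate unlabeled images and train the
# model to predict the rotation -- a label obtained for free.
import torch
import torch.nn as nn

def make_rotation_batch(images: torch.Tensor):
    """Rotate each image by 0/90/180/270 degrees; the rotation index
    serves as the (free) supervision signal."""
    rotated, labels = [], []
    for k in range(4):
        rotated.append(torch.rot90(images, k, dims=(2, 3)))
        labels.append(torch.full((images.size(0),), k, dtype=torch.long))
    return torch.cat(rotated), torch.cat(labels)

# Tiny stand-in encoder with a 4-way rotation classification head.
model = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(16, 4),
)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

images = torch.rand(8, 3, 32, 32)           # an "unlabeled" batch
x, y = make_rotation_batch(images)
loss = nn.functional.cross_entropy(model(x), y)
loss.backward()
opt.step()                                   # one pretext training step
print(float(loss))
```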
Lines changed: 24 additions & 0 deletions
@@ -0,0 +1,24 @@
+{
+  "description": "Hyperparameters are manual, often hard-coded, settings in programming. Some examples are selection of optimizers or learning rates in data science, database performance tuning, or video compression settings. These values and choices may seem incidental to the programming task, but can be extremely important for performance. Tuning them to find the right values can be difficult and time-consuming. This talk introduces Optuna, an open-source Python framework with an eager (define-by-run) interface that automates the process of tuning hyperparameters using blackbox optimization, and highlights the new features and integration modules for other open-source projects available in v1.0.\n\nハイパーパラメータとは、プログラミングにおける手動の設定のことで、多くの場合はハードコーディングされています。例としては、データサイエンスにおけるオプティマイザや学習率の選択、データベースのパフォーマンスチューニング、動画圧縮の設定などがあります。これらの値や選択は、プログラミングのタスクには付随的に見えるかもしれませんが、パフォーマンスにとっては非常に重要です。これらの値をチューニングして正しい値を見つけるのは難しく、時間のかかる作業です。この講演では、ブラックボックス最適化を使用してハイパーパラメータのチューニングプロセスを自動化する、オープンソースのイーガーインターフェース、PythonフレームワークであるOptunaを紹介し、v1.0で利用可能になった他のオープンソースプロジェクトの新機能と統合モジュールをハイライトします。",
+  "duration": 1256,
+  "language": "eng",
+  "recorded": "2020-10-30",
+  "related_urls": [
+    {
+      "label": "Conference Website",
+      "url": "https://www.scipyjapan.scipy.org/"
+    }
+  ],
+  "speakers": [
+    "Crissman Loomis"
+  ],
+  "tags": [],
+  "thumbnail_url": "https://i.ytimg.com/vi_webp/GGqtmmB5t-U/maxresdefault.webp",
+  "title": "Hyperparameters - Autotuning to Make Performance Sing with Optuna",
+  "videos": [
+    {
+      "type": "youtube",
+      "url": "https://www.youtube.com/watch?v=GGqtmmB5t-U"
+    }
+  ]
+}
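A minimal Optuna sketch of the eager (define-by-run) style described above: hyperparameters are suggested inside the objective function itself, so the search space is defined by running the code. The toy objective stands in for a real training run, and suggest_float reflects the current API rather than necessarily the exact v1.0 one.

```python
# Minimal Optuna sketch: blackbox minimization of a toy objective.
import optuna

def objective(trial: optuna.Trial) -> float:
    # Hyperparameters are declared at the point of use ("eager" style).
    x = trial.suggest_float("x", -10.0, 10.0)
    lr = trial.suggest_float("lr", 1e-5, 1e-1, log=True)
    return (x - 2.0) ** 2 + lr        # pretend this is a validation loss

study = optuna.create_study(direction="minimize")
study.optimize(objective, n_trials=50)
print(study.best_params)              # e.g. {'x': ~2.0, 'lr': ~1e-5}
```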
