Update:
With invaluable assistance from @Vermillion-de, ALL the items have now been claimed! We are preparing the final touches on these Chinese (Simplified) subtitles.
Hi there 👋
Let's translate the course to zh-CN
so that the whole community can benefit from this resource 🌎!
Below are the subtitles that need review. The existing translations were produced by Google Translate; let us know here if you'd like to review any of them and we'll add your name to the list. Once you're finished, open a pull request and tag this issue by including #390 in the description.
The recommended PR title is: docs(zh-cn): Reviewed {{TITLE}}
for example: docs(zh-cn): Reviewed No.01 - Welcome to the Hugging Face course
Note:
- Please claim one subtitle at a time.
- Since it may take time to review your PR, please: 1) fork this repo, 2) start a new branch for your review, and 3) open a Pull Request from that branch.
- If you have any questions, ask the team in the `chinese-simplified-translation` channel on Discord.
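The fork/branch/PR flow from the note above can be sketched locally as follows. This is a minimal sketch only: it uses a throwaway repository in place of your actual clone, and the branch name is a hypothetical example (step 1, forking on GitHub, happens in the web UI).

```shell
# Stand-in for a local clone of your fork of huggingface/course.
repo=$(mktemp -d)
git -C "$repo" init -q
git -C "$repo" config user.email "you@example.com"
git -C "$repo" config user.name "Your Name"
git -C "$repo" commit -q --allow-empty -m "initial commit"  # placeholder for existing history

# 2) Start a new branch for the review (name is a hypothetical example):
git -C "$repo" checkout -q -b review-zh-cn-no01

# ... review the subtitle file, then commit using the recommended PR title:
git -C "$repo" commit -q --allow-empty -m "docs(zh-cn): Reviewed No.01 - Welcome to the Hugging Face course"

# 3) Push the branch to your fork and open a Pull Request, mentioning #390
#    in the description (this line is illustrative and not run here):
#    git -C "$repo" push -u origin review-zh-cn-no01

git -C "$repo" log -1 --pretty=%s  # prints the recommended PR title
```

Working on a dedicated branch keeps your fork's main branch clean, so you can claim and review another subtitle on a fresh branch while an earlier PR is still awaiting review.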
Subtitles by Chapter
0 - Setup & 1 - Transformer models
- No. 00 - Welcome to the Hugging Face course - by @iCell
- No. 01 - The pipeline function - by @iCell
- No. 02 - The carbon footprint of Transformers - by @iCell
- No. 03 - What is Transfer Learning? - by @iCell
- No. 04 - The Transformer architecture - by @iCell
- No. 05 - Transformer models: Encoders - by @iCell
- No. 06 - Transformer models: Decoders - by @iCell
- No. 07 - Transformer models: Encoder-Decoders - by @iCell
2 - Using 🤗 Transformers
- No. 08 - What happens inside the pipeline function? (PyTorch) - by @innovation64
- No. 09 - What happens inside the pipeline function? (TensorFlow) - by @Vermillion-de
- No. 10 - Instantiate a Transformers model (PyTorch) - by @PowerChina
- No. 11 - Instantiate a Transformers model (TensorFlow) - by @Vermillion-de
- No. 12 - Tokenizers overview - by @Vermillion-de
- No. 13 - Word-based tokenizers - by @Vermillion-de
- No. 14 - Character-based tokenizers - by @Vermillion-de
- No. 15 - Subword-based tokenizers - by @Vermillion-de
- No. 16 - The tokenization pipeline - by @Vermillion-de
- No. 17 - Batching inputs together (PyTorch) - by @Vermillion-de
- No. 18 - Batching inputs together (TensorFlow) - by @Vermillion-de
3 - Fine-tuning a pretrained model
- No. 19 - Hugging Face Datasets overview (PyTorch) - by @jyizheng
- No. 20 - Hugging Face Datasets overview (TensorFlow) - by @Vermillion-de
- No. 21 - Preprocessing sentence pairs (PyTorch) - by @Vermillion-de
- No. 22 - Preprocessing sentence pairs (TensorFlow) - by @Vermillion-de
- No. 23 - What is dynamic padding? - by @nuass
- No. 24 - The Trainer API - by @maybenotime
- No. 25 - Keras introduction - by @txyzy1232006
- No. 26 - Fine-tuning with TensorFlow - by @Vermillion-de
- No. 27 - Learning rate scheduling with TensorFlow - by @Vermillion-de
- No. 28 - TensorFlow Predictions and metrics - by @Vermillion-de
- No. 29 - Write your training loop in PyTorch - by @FYJNEVERFOLLOWS
- No. 30 - Supercharge your PyTorch training loop with Accelerate - by @Vermillion-de
4 - Sharing models and tokenizers
- No. 31 - Navigating the Model Hub - by @tyisme614
- No. 32 - Managing a repo on the Model Hub - by @tyisme614
- No. 33 - The Push to Hub API (PyTorch) - by @tyisme614
- No. 34 - The Push to Hub API (TensorFlow) - by @tyisme614
5 - The 🤗 Datasets library
- No. 35 - Loading a custom dataset - by @tyisme614
- No. 36 - Slice and dice a dataset 🔪 - by @tyisme614
- No. 37 - Datasets + DataFrames = ❤️ - by @tyisme614
- No. 38 - Saving and reloading a dataset - by @tyisme614
- No. 39 - Memory mapping & streaming - by @tyisme614
- No. 40 - Uploading a dataset to the Hub - by @tyisme614
- No. 41 - Text embeddings & semantic search - by @tyisme614
6 - The 🤗 Tokenizers library
- No. 42 - Training a new tokenizer - by @Vermillion-de
- No. 43 - Why are fast tokenizers called fast? - by @Vermillion-de
- No. 44 - Fast tokenizer superpowers - by @Vermillion-de
- No. 45 - Inside the Token classification pipeline (PyTorch) - by @Vermillion-de
- No. 46 - Inside the Token classification pipeline (TensorFlow) - by @Vermillion-de
- No. 47 - Inside the Question answering pipeline (PyTorch) - by @Vermillion-de
- No. 48 - Inside the Question answering pipeline (TensorFlow) - by @Vermillion-de
- No. 49 - What is normalization? - by @Vermillion-de
- No. 50 - What is pre-tokenization? - by @Vermillion-de
- No. 51 - Byte Pair Encoding Tokenization - by @Vermillion-de
- No. 52 - WordPiece Tokenization - by @Vermillion-de
- No. 53 - Unigram Tokenization - by @Vermillion-de
- No. 54 - Building a new tokenizer - by @Vermillion-de
7 - Main NLP tasks (DONE)
- No. 55 - Data processing for Token Classification - by @tyisme614
- No. 56 - Data processing for Masked Language Modeling - by @tyisme614
- No. 57 - What is perplexity? - by @tyisme614
- No. 58 - What is domain adaptation? - by @tyisme614
- No. 59 - Data processing for Translation - by @tyisme614
- No. 60 - What is the BLEU metric? - by @tyisme614
- No. 61 - Data processing for Summarization - by @tyisme614
- No. 62 - What is the ROUGE metric? - by @tyisme614
- No. 63 - Data processing for Causal Language Modeling - by @tyisme614
- No. 64 - Using a custom loss function - by @tyisme614
- No. 65 - Data processing for Question Answering - by @tyisme614
- No. 66 - The Post processing step in Question Answering (PyTorch) - by @tyisme614
- No. 67 - The Post processing step in Question Answering (TensorFlow) - by @tyisme614
- No. 68 - Data Collators: A Tour - by @tyisme614
8 - How to ask for help
- No. 69 - What to do when you get an error? - by @yaoqih
- No. 70 - Using a debugger in a notebook - by @iCell
- No. 71 - Using a debugger in a terminal - by @iCell
- No. 72 - Asking for help on the forums - by @zhangchaosd
- No. 73 - Debugging the Training Pipeline (PyTorch) - by @Vermillion-de
- No. 74 - Debugging the Training Pipeline (TensorFlow) - by @Vermillion-de
- No. 75 - Writing a good issue - by @Vermillion-de