diff --git a/.github/workflows/main.yml b/.github/workflows/main.yml new file mode 100644 index 0000000000..57ef288567 --- /dev/null +++ b/.github/workflows/main.yml @@ -0,0 +1,46 @@ +--- +name: Generate PDF using Pandoc + +# Controls when the workflow will run +on: + # Triggers the workflow on push events, but only for the master branch + push: + branches: [ master ] + + # Allows you to run this workflow manually from the Actions tab + workflow_dispatch: + +# A workflow run is made up of one or more jobs that can run sequentially or in parallel +jobs: + # This workflow contains a single job called "create_pdf" + create_pdf: + # The type of runner that the job will run on + runs-on: ubuntu-20.04 + + # Pandoc Docker image v2.17.0.1 + container: + image: pandoc/latex:2.17.0.1 + + # Steps represent a sequence of tasks that will be executed as part of the job + steps: + # Checks-out your repository under $GITHUB_WORKSPACE, so your job can access it + - uses: actions/checkout@v2 + + # Run the PDF creation script in the container + - uses: docker://pandoc/latex:2.17.0.1 + with: + entrypoint: './pandoc-docker.sh' + + # Create an output with the shortened version of the commit SHA + - id: short_sha + env: + GITHUB_SHA: ${{ github.sha }} + run: echo "::set-output name=short_sha::$(echo ${GITHUB_SHA::7})" + + # Create a release with the PDFs as assets and the shortened SHA as the tag name + - uses: softprops/action-gh-release@v1 + with: + tag_name: v${{ steps.short_sha.outputs.short_sha }} + draft: false + files: | + *.pdf diff --git a/.gitignore b/.gitignore index 5ca2fa242d..77c14a88a3 100644 --- a/.gitignore +++ b/.gitignore @@ -1,5 +1,6 @@ # Byte-compiled / optimized / DLL files *.epub +*.pdf __pycache__/ *.py[cod] @@ -61,4 +62,4 @@ target/ scratch/ # IPython Notebook templates -template.ipynb \ No newline at end of file +template.ipynb diff --git a/README-zh-Hans.md b/README-zh-Hans.md index 047f68132a..5a07c0c25a 100644 --- a/README-zh-Hans.md +++ 
b/README-zh-Hans.md @@ -1193,7 +1193,7 @@ def get_user(self, user_id): ##### 缓存的缺点: - 请求的数据如果不在缓存中就需要经过三个步骤来获取数据,这会导致明显的延迟。 -- 如果数据库中的数据更新了会导致缓存中的数据过时。这个问题需要通过设置 TTL 强制更新缓存或者直写模式来缓解这种情况。 +- 如果数据库中的数据更新了会导致缓存中的数据过时。这个问题需要通过设置 TTL 强制更新缓存或者直写模式来缓解这种情况。 - 当一个节点出现故障的时候,它将会被一个新的节点替代,这增加了延迟的时间。 #### 直写模式 diff --git a/README.md b/README.md index f009ae8cd7..4ede388b95 100644 --- a/README.md +++ b/README.md @@ -45,10 +45,7 @@ Additional topics for interview prep: ## Anki flashcards -

- -
-

+![Anki flashcards](images/zdCAkB3.png) The provided [Anki flashcard decks](https://apps.ankiweb.net/) use spaced repetition to help you retain key system design concepts. @@ -62,10 +59,7 @@ Great for use while on-the-go. Looking for resources to help you prep for the [**Coding Interview**](https://github.com/donnemartin/interactive-coding-challenges)? -

- -
-

+![Interactive Coding Challenges](images/b4YtAEN.png) Check out the sister repo [**Interactive Coding Challenges**](https://github.com/donnemartin/interactive-coding-challenges), which contains an additional Anki deck: @@ -183,7 +177,7 @@ Review the [Contributing Guidelines](CONTRIBUTING.md). > Suggested topics to review based on your interview timeline (short, medium, long). -![Imgur](images/OfVllex.png) +![Study Guide](images/OfVllex.png) **Q: For interviews, do I need to know everything here?** @@ -306,49 +300,49 @@ Check out the following links to get a better idea of what to expect: [View exercise and solution](solutions/system_design/pastebin/README.md) -![Imgur](images/4edXG0T.png) +![Scaled design of Pastebin.com (or Bit.ly)](images/4edXG0T.png) ### Design the Twitter timeline and search (or Facebook feed and search) [View exercise and solution](solutions/system_design/twitter/README.md) -![Imgur](images/jrUBAF7.png) +![Scaled design of the Twitter timeline and search (or Facebook feed and search)](images/jrUBAF7.png) ### Design a web crawler [View exercise and solution](solutions/system_design/web_crawler/README.md) -![Imgur](images/bWxPtQA.png) +![Scaled design of a web crawler](images/bWxPtQA.png) ### Design Mint.com [View exercise and solution](solutions/system_design/mint/README.md) -![Imgur](images/V5q57vU.png) +![Scaled design of Mint.com](images/V5q57vU.png) ### Design the data structures for a social network [View exercise and solution](solutions/system_design/social_graph/README.md) -![Imgur](images/cdCv5g7.png) +![Scaled design of the data structures for a social network](images/cdCv5g7.png) ### Design a key-value store for a search engine [View exercise and solution](solutions/system_design/query_cache/README.md) -![Imgur](images/4j99mhe.png) +![Scaled design of a key-value store for a search engine](images/4j99mhe.png) ### Design Amazon's sales ranking by category feature [View exercise and 
solution](solutions/system_design/sales_rank/README.md) -![Imgur](images/MzExP06.png) +![Scaled design of Amazon's sales ranking by category feature](images/MzExP06.png) ### Design a system that scales to millions of users on AWS [View exercise and solution](solutions/system_design/scaling_aws/README.md) -![Imgur](images/jj3A5N8.png) +![Scaled design of a system that scales to millions of users on AWS](images/jj3A5N8.png) ## Object-oriented design interview questions with solutions @@ -439,11 +433,8 @@ Generally, you should aim for **maximal throughput** with **acceptable latency** ### CAP theorem -

- -
- Source: CAP theorem revisited -

+![CAP theorem](images/bgLMI2u.png) +[Source: CAP theorem revisited](https://robertgreiner.com/cap-theorem-revisited) In a distributed computer system, you can only support two of the following guarantees: @@ -580,11 +571,8 @@ If both `Foo` and `Bar` each had 99.9% availability, their total availability in ## Domain name system -

- -
- Source: DNS security presentation -

+![Simple DNS lookup](images/IOyLj4i.jpg) +[Source: DNS security presentation](https://www.slideshare.net/srikrupa5/dns-security-presentation-issa) A Domain Name System (DNS) translates a domain name such as www.example.com to an IP address. @@ -618,11 +606,8 @@ Services such as [CloudFlare](https://www.cloudflare.com/dns/) and [Route 53](ht ## Content delivery network -

- -
- Source: Why use a CDN -

+![Content delivery network](images/h9TAuGI.jpg) +[Source: Why use a CDN](https://www.creative-artworks.eu/why-use-a-content-delivery-network-cdn/) A content delivery network (CDN) is a globally distributed network of proxy servers, serving content from locations closer to the user. Generally, static files such as HTML/CSS/JS, photos, and videos are served from CDN, although some CDNs such as Amazon's CloudFront support dynamic content. The site's DNS resolution will tell clients which server to contact. @@ -659,11 +644,8 @@ Sites with heavy traffic work well with pull CDNs, as traffic is spread out more ## Load balancer -

- -
- Source: Scalable system design patterns -

+![Load balancer pattern](images/h81n9iK.png) +[Source: Scalable system design patterns](https://horicky.blogspot.com/2010/10/scalable-system-design-patterns.html) Load balancers distribute incoming client requests to computing resources such as application servers and databases. In each case, the load balancer returns the response from the computing resource to the appropriate client. Load balancers are effective at: @@ -729,12 +711,8 @@ Load balancers can also help with horizontal scaling, improving performance and ## Reverse proxy (web server) -

- -
- Source: Wikipedia -
-

+![Reverse proxy (web server)](images/n41Azff.png) +[Source: Wikipedia](https://upload.wikimedia.org/wikipedia/commons/6/67/Reverse_proxy_h2g2bob.svg) A reverse proxy is a web server that centralizes internal services and provides unified interfaces to the public. Requests from clients are forwarded to a server that can fulfill it before the reverse proxy returns the server's response to the client. @@ -772,11 +750,8 @@ Additional benefits include: ## Application layer -

- -
- Source: Intro to architecting systems for scale -

+![Load balancing requests](images/yB5SYwm.png) +[Source: Intro to architecting systems for scale](https://lethain.com/introduction-to-architecting-systems-for-scale/#platform_layer) Separating out the web layer from the application layer (also known as platform layer) allows you to scale and configure both layers independently. Adding a new API results in adding application servers without necessarily adding additional web servers. The **single responsibility principle** advocates for small and autonomous services that work together. Small teams with small services can plan more aggressively for rapid growth. @@ -807,11 +782,8 @@ Systems such as [Consul](https://www.consul.io/docs/index.html), [Etcd](https:// ## Database -

- -
- Source: Scaling up to your first 10 million users -

+![Scaled database serving traffic](images/Xkm5CXz.png) +[Source: Scaling up to your first 10 million users](https://www.youtube.com/watch?v=kKjm4ehYiMs) ### Relational database management system (RDBMS) @@ -830,11 +802,8 @@ There are many techniques to scale a relational database: **master-slave replica The master serves reads and writes, replicating writes to one or more slaves, which serve only reads. Slaves can also replicate to additional slaves in a tree-like fashion. If the master goes offline, the system can continue to operate in read-only mode until a slave is promoted to a master or a new master is provisioned. -

- -
- Source: Scalability, availability, stability, patterns -

+![Master-slave replication](images/C9ioGtn.png) +[Source: Scalability, availability, stability, patterns](https://www.slideshare.net/jboner/scalability-availability-stability-patterns) ##### Disadvantage(s): master-slave replication @@ -845,11 +814,8 @@ The master serves reads and writes, replicating writes to one or more slaves, wh Both masters serve reads and writes and coordinate with each other on writes. If either master goes down, the system can continue to operate with both reads and writes. -

- -
- Source: Scalability, availability, stability, patterns -

+![Master-master replication](images/krAHLGg.png) +[Source: Scalability, availability, stability, patterns](https://www.slideshare.net/jboner/scalability-availability-stability-patterns) ##### Disadvantage(s): master-master replication @@ -873,11 +839,8 @@ Both masters serve reads and writes and coordinate with each other on writes. I #### Federation -

- -
- Source: Scaling up to your first 10 million users -

+![Functional partitioning (federation)](images/U3qV33e.png) +[Source: Scaling up to your first 10 million users](https://www.youtube.com/watch?v=kKjm4ehYiMs) Federation (or functional partitioning) splits up databases by function. For example, instead of a single, monolithic database, you could have three databases: **forums**, **users**, and **products**, resulting in less read and write traffic to each database and therefore less replication lag. Smaller databases result in more data that can fit in memory, which in turn results in more cache hits due to improved cache locality. With no single central master serializing writes you can write in parallel, increasing throughput. @@ -894,11 +857,8 @@ Federation (or functional partitioning) splits up databases by function. For ex #### Sharding -

- -
- Source: Scalability, availability, stability, patterns -

+![Sharding](images/wU8x5Id.png) +[Source: Scalability, availability, stability, patterns](https://www.slideshare.net/jboner/scalability-availability-stability-patterns) Sharding distributes data across different databases such that each database can only manage a subset of the data. Taking a users database as an example, as the number of users increases, more shards are added to the cluster. @@ -1038,11 +998,8 @@ Document stores provide high flexibility and are often used for working with occ #### Wide column store -

- -
- Source: SQL & NoSQL, a brief history -

+![Wide column store](images/n16iOGk.png) +[Source: SQL & NoSQL, a brief history](https://blog.grio.com/2015/11/sql-nosql-a-brief-history.html) > Abstraction: nested map `ColumnFamily<RowKey, Columns<ColKey, Value, Timestamp>>` @@ -1061,11 +1018,8 @@ Wide column stores offer high availability and high scalability. They are often #### Graph database -

- -
- Source: Graph database -

+![Graph database](images/fNcl65g.png) +[Source: Graph database](https://en.wikipedia.org/wiki/File:GraphDatabase_PropertyGraph.png) > Abstraction: graph @@ -1089,11 +1043,8 @@ Graphs databases offer high performance for data models with complex relationshi ### SQL or NoSQL -

- -
- Source: Transitioning from RDBMS to NoSQL -

+![SQL or NoSQL](images/wXGqG5f.png) +[Source: Transitioning from RDBMS to NoSQL](https://www.infoq.com/articles/Transition-RDBMS-NoSQL) Reasons for **SQL**: @@ -1131,11 +1082,8 @@ Sample data well-suited for NoSQL: ## Cache -

- -
- Source: Scalable system design patterns -

+![Cache](images/Q6z24La.png) +[Source: Scalable system design patterns](https://horicky.blogspot.com/2010/10/scalable-system-design-patterns.html) Caching improves page load times and can reduce the load on your servers and databases. In this model, the dispatcher will first lookup if the request has been made before and try to find the previous result to return, in order to save the actual execution. @@ -1202,11 +1150,8 @@ Since you can only store a limited amount of data in cache, you'll need to deter #### Cache-aside -

- -
- Source: From cache to in-memory data grid -

+![Cache-aside](images/ONjORqk.png) +[Source: From cache to in-memory data grid](https://www.slideshare.net/tmatyashovsky/from-cache-to-in-memory-data-grid-introduction-to-hazelcast) The application is responsible for reading and writing from storage. The cache does not interact with storage directly. The application does the following: @@ -1238,11 +1183,8 @@ Subsequent reads of data added to cache are fast. Cache-aside is also referred #### Write-through -

- -
- Source: Scalability, availability, stability, patterns -

+![Write-through](images/0vBc0hN.png) +[Source: Scalability, availability, stability, patterns](https://www.slideshare.net/jboner/scalability-availability-stability-patterns) The application uses the cache as the main data store, reading and writing data to it, while the cache is responsible for reading and writing to the database: @@ -1273,11 +1215,8 @@ Write-through is a slow overall operation due to the write operation, but subseq #### Write-behind (write-back) -

- -
- Source: Scalability, availability, stability, patterns -

+![Write-behind (write-back)](images/rgSrvjG.png) +[Source: Scalability, availability, stability, patterns](https://www.slideshare.net/jboner/scalability-availability-stability-patterns) In write-behind, the application does the following: @@ -1291,11 +1230,8 @@ In write-behind, the application does the following: #### Refresh-ahead -

- -
- Source: From cache to in-memory data grid -

+![Refresh-ahead](images/kxtjqgE.png) +[Source: From cache to in-memory data grid](https://www.slideshare.net/tmatyashovsky/from-cache-to-in-memory-data-grid-introduction-to-hazelcast) You can configure the cache to automatically refresh any recently accessed cache entry prior to its expiration. @@ -1323,11 +1259,8 @@ Refresh-ahead can result in reduced latency vs read-through if the cache can acc ## Asynchronism -

- -
- Source: Intro to architecting systems for scale -

+![Asynchronism](images/54GYsSx.png) +[Source: Intro to architecting systems for scale](https://lethain.com/introduction-to-architecting-systems-for-scale/#platform_layer) Asynchronous workflows help reduce request times for expensive operations that would otherwise be performed in-line. They can also help by doing time-consuming work in advance, such as periodic aggregation of data. @@ -1369,11 +1302,8 @@ If queues start to grow significantly, the queue size can become larger than mem ## Communication -

- -
- Source: OSI 7 layer model -

+![OSI 7 Layer Model](images/5KeocQs.jpg) +[Source: OSI 7 layer model](https://www.escotal.com/osilayer.html) ### Hypertext transfer protocol (HTTP) @@ -1401,11 +1331,8 @@ HTTP is an application layer protocol relying on lower-level protocols such as * ### Transmission control protocol (TCP) -

- -
- Source: How to make a multiplayer game -

+![TCP](images/JdAsdvG.jpg) +[Source: How to make a multiplayer game](http://www.wildbunny.co.uk/blog/2012/10/09/how-to-make-a-multi-player-game-part-1) TCP is a connection-oriented protocol over an [IP network](https://en.wikipedia.org/wiki/Internet_Protocol). Connection is established and terminated using a [handshake](https://en.wikipedia.org/wiki/Handshaking). All packets sent are guaranteed to reach the destination in the original order and without corruption through: @@ -1425,11 +1352,8 @@ Use TCP over UDP when: ### User datagram protocol (UDP) -

- -
- Source: How to make a multiplayer game -

+![UDP](images/yzDrJtA.jpg) +[Source: How to make a multiplayer game](http://www.wildbunny.co.uk/blog/2012/10/09/how-to-make-a-multi-player-game-part-1) UDP is connectionless. Datagrams (analogous to packets) are guaranteed only at the datagram level. Datagrams might reach their destination out of order or not at all. UDP does not support congestion control. Without the guarantees that TCP support, UDP is generally more efficient. @@ -1454,11 +1378,8 @@ Use UDP over TCP when: ### Remote procedure call (RPC) -

- -
- Source: Crack the system design interview -

+![RPC](images/iF4Mkb5.png) +[Source: Crack the system design interview](https://www.puncsky.com/blog/2016-02-13-crack-the-system-design-interview) In an RPC, a client causes a procedure to execute on a different address space, usually a remote server. The procedure is coded as if it were a local procedure call, abstracting away the details of how to communicate with the server from the client program. Remote calls are usually slower and less reliable than local calls so it is helpful to distinguish RPC calls from local calls. Popular RPC frameworks include [Protobuf](https://developers.google.com/protocol-buffers/), [Thrift](https://thrift.apache.org/), and [Avro](https://avro.apache.org/docs/current/). @@ -1680,11 +1601,8 @@ Handy metrics based on numbers above: > Articles on how real world systems are designed. -

- -
- Source: Twitter timelines at scale -

+![Twitter timeline scalability](images/TcUo2fw.png) +[Source: Twitter timelines at scale](https://www.infoq.com/presentations/Twitter-Timeline-Scalability) **Don't focus on nitty gritty details for the following articles, instead:** diff --git a/deeplists.tex b/deeplists.tex new file mode 100644 index 0000000000..a35480a6cd --- /dev/null +++ b/deeplists.tex @@ -0,0 +1,24 @@ + \usepackage{enumitem} + \setlistdepth{9} + + \setlist[itemize,1]{label=$\bullet$} + \setlist[itemize,2]{label=$\bullet$} + \setlist[itemize,3]{label=$\bullet$} + \setlist[itemize,4]{label=$\bullet$} + \setlist[itemize,5]{label=$\bullet$} + \setlist[itemize,6]{label=$\bullet$} + \setlist[itemize,7]{label=$\bullet$} + \setlist[itemize,8]{label=$\bullet$} + \setlist[itemize,9]{label=$\bullet$} + \renewlist{itemize}{itemize}{9} + + \setlist[enumerate,1]{label=$\arabic*.$} + \setlist[enumerate,2]{label=$\alph*.$} + \setlist[enumerate,3]{label=$\roman*.$} + \setlist[enumerate,4]{label=$\arabic*.$} + \setlist[enumerate,5]{label=$\alph*.$} + \setlist[enumerate,6]{label=$\roman*.$} + \setlist[enumerate,7]{label=$\arabic*.$} + \setlist[enumerate,8]{label=$\alph*.$} + \setlist[enumerate,9]{label=$\roman*.$} + \renewlist{enumerate}{enumerate}{9} diff --git a/generate-epub.sh b/generate-epub.sh index 18690fbb52..a6bfe05b50 100755 --- a/generate-epub.sh +++ b/generate-epub.sh @@ -34,7 +34,7 @@ generate () { cat $name.md | generate_from_stdin $name.epub $language } -# Check if depencies exist +# Check if dependencies exist check_dependencies () { for dependency in "${dependencies[@]}" do diff --git a/images/3X8nmdL.png b/images/3X8nmdL.png new file mode 100644 index 0000000000..121363731d Binary files /dev/null and b/images/3X8nmdL.png differ diff --git a/images/48tEA2j.png b/images/48tEA2j.png new file mode 100644 index 0000000000..637298f9dc Binary files /dev/null and b/images/48tEA2j.png differ diff --git a/images/B8LDKD7.png b/images/B8LDKD7.png new file mode 100644 index 0000000000..22b1764d36 Binary 
files /dev/null and b/images/B8LDKD7.png differ diff --git a/images/BKsBnmG.png b/images/BKsBnmG.png new file mode 100644 index 0000000000..ab0ad6f5c9 Binary files /dev/null and b/images/BKsBnmG.png differ diff --git a/images/E8klrBh.png b/images/E8klrBh.png new file mode 100644 index 0000000000..9c466293b9 Binary files /dev/null and b/images/E8klrBh.png differ diff --git a/images/KqZ3dSx.png b/images/KqZ3dSx.png new file mode 100644 index 0000000000..cfe97fc3ae Binary files /dev/null and b/images/KqZ3dSx.png differ diff --git a/images/OZCxJr0.png b/images/OZCxJr0.png new file mode 100644 index 0000000000..36f9db8000 Binary files /dev/null and b/images/OZCxJr0.png differ diff --git a/images/raoFTXM.png b/images/raoFTXM.png new file mode 100644 index 0000000000..b9632913f6 Binary files /dev/null and b/images/raoFTXM.png differ diff --git a/images/rrfjMXB.png b/images/rrfjMXB.png new file mode 100644 index 0000000000..3cb8eca187 Binary files /dev/null and b/images/rrfjMXB.png differ diff --git a/images/vwMa1Qu.png b/images/vwMa1Qu.png new file mode 100644 index 0000000000..bcde141173 Binary files /dev/null and b/images/vwMa1Qu.png differ diff --git a/images/wxXyq2J.png b/images/wxXyq2J.png new file mode 100644 index 0000000000..6f60c3d9a5 Binary files /dev/null and b/images/wxXyq2J.png differ diff --git a/images/xjdAAUv.png b/images/xjdAAUv.png new file mode 100644 index 0000000000..97632ac569 Binary files /dev/null and b/images/xjdAAUv.png differ diff --git a/pandoc-docker.sh b/pandoc-docker.sh new file mode 100755 index 0000000000..4bddf57075 --- /dev/null +++ b/pandoc-docker.sh @@ -0,0 +1,15 @@ +#! /bin/sh + +# Install dependencies for successful PDF generations +tlmgr install ctex enumitem float koma-script + +# Generate PDFs using pandoc +for filename in pandoc-*yaml; do + # Create variable for language based on filename + language=`echo $filename | cut -d'.' 
-f1 | cut -d'-' -f2-3` + + # Attempt to create the PDF + echo "Generating ${language} PDF with solutions..." + pandoc -d ${filename} + [ $? -eq 0 ] && echo "Success! The ${language} PDF has been successfully created!" +done diff --git a/pandoc-en-US.yaml b/pandoc-en-US.yaml new file mode 100644 index 0000000000..dbdf1d5533 --- /dev/null +++ b/pandoc-en-US.yaml @@ -0,0 +1,38 @@ +--- +metadata: + title: System Design Primer + subtitle: Learn how to design large-scale systems. Prep for the system design interview. + author: Donne Martin + category: "License: Creative Commons Attribution 4.0 International License" + keywords: + - "system" + - "design" + - "primer" +standalone: true +variables: + documentclass: scrbook + lang: en-US + links-as-notes: true + lot: false + lof: false + margin-top: 1.27cm + margin-left: .635cm + margin-right: .635cm + margin-bottom: 1.27cm +table-of-contents: true +toc-depth: 2 +include-in-header: deeplists.tex +verbosity: ERROR +pdf-engine: lualatex +input-files: + - ./README.md + - ./solutions/system_design/social_graph/README.md + - ./solutions/system_design/web_crawler/README.md + - ./solutions/system_design/scaling_aws/README.md + - ./solutions/system_design/pastebin/README.md + - ./solutions/system_design/sales_rank/README.md + - ./solutions/system_design/twitter/README.md + - ./solutions/system_design/mint/README.md + - ./solutions/system_design/query_cache/README.md +output-file: system-design-primer.pdf +... 
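The filename-to-language step in the `pandoc-docker.sh` loop can be sketched in isolation. This is a minimal sketch and not part of the patch: the function name `language_from_filename` is hypothetical, and it assumes every defaults file is named `pandoc-<language>.yaml`, as the files added in this change are. It uses POSIX parameter expansion rather than the `cut` pipeline, so it keeps everything between the prefix and the suffix instead of only the second and third dash-separated fields.

```shell
#!/bin/sh
# Hypothetical helper (an assumption for illustration, not in the patch):
# derive the language tag from a defaults-file name such as
# "pandoc-en-US.yaml".
language_from_filename() {
    name=${1##*/}        # drop any leading directory component
    name=${name%.yaml}   # strip the ".yaml" suffix
    printf '%s\n' "${name#pandoc-}"   # strip the "pandoc-" prefix
}

# Show the mapping for the three defaults files named in this change.
for f in pandoc-en-US.yaml pandoc-zh-CN.yaml pandoc-zh-Hans.yaml; do
    echo "$f -> $(language_from_filename "$f")"
done
```

Because the expansions are POSIX, the same helper would behave identically under `sh` in the container and under `bash` in `pandoc.sh`.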
diff --git a/pandoc-zh-CN.yaml b/pandoc-zh-CN.yaml new file mode 100644 index 0000000000..38afb7d05a --- /dev/null +++ b/pandoc-zh-CN.yaml @@ -0,0 +1,34 @@ +--- +metadata: + title: 系统设计入门 + subtitle: 学习如何设计大型系统。为系统设计的面试做准备。 + author: Donne Martin + category: "License: Creative Commons Attribution 4.0 International License" +standalone: true +variables: + documentclass: scrbook + lang: zh + links-as-notes: true + lot: false + lof: false + margin-top: 1.27cm + margin-left: .635cm + margin-right: .635cm + margin-bottom: 1.27cm +table-of-contents: true +toc-depth: 2 +include-in-header: deeplists.tex +verbosity: ERROR +pdf-engine: xelatex +input-files: + - ./README-zh-Hans.md + - ./solutions/system_design/social_graph/README-zh-Hans.md + - ./solutions/system_design/web_crawler/README-zh-Hans.md + - ./solutions/system_design/scaling_aws/README-zh-Hans.md + - ./solutions/system_design/pastebin/README-zh-Hans.md + - ./solutions/system_design/sales_rank/README-zh-Hans.md + - ./solutions/system_design/twitter/README-zh-Hans.md + - ./solutions/system_design/mint/README-zh-Hans.md + - ./solutions/system_design/query_cache/README-zh-Hans.md +output-file: system-design-primer-zh.pdf +... 
diff --git a/pandoc-zh-Hans.yaml b/pandoc-zh-Hans.yaml new file mode 100644 index 0000000000..b03ac31a96 --- /dev/null +++ b/pandoc-zh-Hans.yaml @@ -0,0 +1,34 @@ +--- +metadata: + title: 系统设计入门 + subtitle: 学习如何设计大型系统。为系统设计的面试做准备。 + author: Donne Martin + category: "License: Creative Commons Attribution 4.0 International License" +standalone: true +variables: + documentclass: scrbook + lang: zh-Hans + links-as-notes: true + lot: false + lof: false + margin-top: 1.27cm + margin-left: .635cm + margin-right: .635cm + margin-bottom: 1.27cm +table-of-contents: true +toc-depth: 2 +include-in-header: deeplists.tex +verbosity: ERROR +pdf-engine: xelatex +input-files: + - ./README-zh-Hans.md + - ./solutions/system_design/social_graph/README-zh-Hans.md + - ./solutions/system_design/web_crawler/README-zh-Hans.md + - ./solutions/system_design/scaling_aws/README-zh-Hans.md + - ./solutions/system_design/pastebin/README-zh-Hans.md + - ./solutions/system_design/sales_rank/README-zh-Hans.md + - ./solutions/system_design/twitter/README-zh-Hans.md + - ./solutions/system_design/mint/README-zh-Hans.md + - ./solutions/system_design/query_cache/README-zh-Hans.md +output-file: system-design-primer-zh.pdf +... diff --git a/pandoc.sh b/pandoc.sh new file mode 100755 index 0000000000..9ccdea5757 --- /dev/null +++ b/pandoc.sh @@ -0,0 +1,31 @@ +#! /bin/bash + +# Generate PDFs using pandoc +generate_pdfs_with_solutions() { + for filename in pandoc-*yaml; do + # Create variable for language based on filename + IFS=- read _ language <<< "${filename}" + language=${language/.yaml/} + + # Attempt to create the PDF + echo "Generating ${language} PDF with solutions..." + pandoc -d ${filename} + [[ $? -eq 0 ]] && echo "Success! The ${language} PDF has been successfully created!" + done +} + +# Check if dependencies exist +check_dependencies () { + for dependency in "${dependencies[@]}" + do + if ! [ -x "$(command -v $dependency)" ]; then + echo "Error: $dependency is not installed." 
>&2 + exit 1 + fi + done +} + +dependencies=("pandoc" "tex") + +check_dependencies +generate_pdfs_with_solutions diff --git a/solutions/system_design/mint/README.md b/solutions/system_design/mint/README.md index 1ec31674db..8401a8cca6 100644 --- a/solutions/system_design/mint/README.md +++ b/solutions/system_design/mint/README.md @@ -80,7 +80,7 @@ Handy conversion guide: > Outline a high level design with all important components. -![Imgur](http://i.imgur.com/E8klrBh.png) +![High level design of Mint.com](https://i.imgur.com/E8klrBh.png) ## Step 3: Design core components @@ -327,7 +327,7 @@ class SpendingByCategory(MRJob): > Identify and address bottlenecks, given the constraints. -![Imgur](http://i.imgur.com/V5q57vU.png) +![Scaled design of Mint.com](https://i.imgur.com/V5q57vU.png) **Important: Do not simply jump right into the final design from the initial design!** diff --git a/solutions/system_design/pastebin/README.md b/solutions/system_design/pastebin/README.md index 2d87ddcc7e..21767f0031 100644 --- a/solutions/system_design/pastebin/README.md +++ b/solutions/system_design/pastebin/README.md @@ -79,7 +79,7 @@ Handy conversion guide: > Outline a high level design with all important components. -![Imgur](http://i.imgur.com/BKsBnmG.png) +![High level design of Pastebin.com (or Bit.ly)](https://i.imgur.com/BKsBnmG.png) ## Step 3: Design core components @@ -235,7 +235,7 @@ To delete expired pastes, we could just scan the **SQL Database** for all entrie > Identify and address bottlenecks, given the constraints. 
-![Imgur](http://i.imgur.com/4edXG0T.png) +![Scaled design of Pastebin.com (or Bit.ly)](https://i.imgur.com/4edXG0T.png) **Important: Do not simply jump right into the final design from the initial design!** diff --git a/solutions/system_design/query_cache/README.md b/solutions/system_design/query_cache/README.md index 032adf34ab..b99f6033d6 100644 --- a/solutions/system_design/query_cache/README.md +++ b/solutions/system_design/query_cache/README.md @@ -58,7 +58,7 @@ Handy conversion guide: > Outline a high level design with all important components. -![Imgur](http://i.imgur.com/KqZ3dSx.png) +![High level of a key-value cache to save the results of the most recent web server queries](https://i.imgur.com/KqZ3dSx.png) ## Step 3: Design core components @@ -212,7 +212,7 @@ Refer to [When to update the cache](https://github.com/donnemartin/system-design > Identify and address bottlenecks, given the constraints. -![Imgur](http://i.imgur.com/4j99mhe.png) +![Scaled design of a key-value store for a search engine](https://i.imgur.com/4j99mhe.png) **Important: Do not simply jump right into the final design from the initial design!** diff --git a/solutions/system_design/sales_rank/README.md b/solutions/system_design/sales_rank/README.md index 71ad1c7d20..d27a0e33b5 100644 --- a/solutions/system_design/sales_rank/README.md +++ b/solutions/system_design/sales_rank/README.md @@ -70,7 +70,7 @@ Handy conversion guide: > Outline a high level design with all important components. -![Imgur](http://i.imgur.com/vwMa1Qu.png) +![High level design of Amazon's sales ranking by category feature](https://i.imgur.com/vwMa1Qu.png) ## Step 3: Design core components @@ -239,7 +239,7 @@ For internal communications, we could use [Remote Procedure Calls](https://githu > Identify and address bottlenecks, given the constraints. 
-![Imgur](http://i.imgur.com/MzExP06.png) +![Scaled design of Amazon's sales ranking by category feature](https://i.imgur.com/MzExP06.png) **Important: Do not simply jump right into the final design from the initial design!** diff --git a/solutions/system_design/scaling_aws/README.md b/solutions/system_design/scaling_aws/README.md index 99af0cfff8..a5f46f38ff 100644 --- a/solutions/system_design/scaling_aws/README.md +++ b/solutions/system_design/scaling_aws/README.md @@ -64,7 +64,7 @@ Handy conversion guide: > Outline a high level design with all important components. -![Imgur](http://i.imgur.com/B8LDKD7.png) +![High level design of an AWS service](https://i.imgur.com/B8LDKD7.png) ## Step 3: Design core components @@ -139,7 +139,7 @@ Add a **DNS** such as Route 53 to map the domain to the instance's public IP. ### Users+ -![Imgur](http://i.imgur.com/rrfjMXB.png) +![Scaled design of an AWS service to lighten load on a single box and allow for independent scaling](https://i.imgur.com/rrfjMXB.png) #### Assumptions @@ -191,7 +191,7 @@ We've been able to address these issues with **Vertical Scaling** so far. 
Unfor ### Users++ -![Imgur](http://i.imgur.com/raoFTXM.png) +![Scaled design of an AWS service to address web server scaling](https://i.imgur.com/raoFTXM.png) #### Assumptions @@ -220,7 +220,7 @@ Our **Benchmarks/Load Tests** and **Profiling** show that our single **Web Serve ### Users+++ -![Imgur](http://i.imgur.com/OZCxJr0.png) +![Scaled design of an AWS service to address MySQL scaling](https://i.imgur.com/OZCxJr0.png) **Note:** **Internal Load Balancers** not shown to reduce clutter @@ -258,7 +258,7 @@ Our **Benchmarks/Load Tests** and **Profiling** show that we are read-heavy (100 ### Users++++ -![Imgur](http://i.imgur.com/3X8nmdL.png) +![Scaled design of an AWS service with autoscaling added](https://i.imgur.com/3X8nmdL.png) #### Assumptions @@ -297,7 +297,7 @@ Our **Benchmarks/Load Tests** and **Profiling** show that our traffic spikes dur ### Users+++++ -![Imgur](http://i.imgur.com/jj3A5N8.png) +![Scaled design of a system that scales to millions of users on AWS](https://i.imgur.com/jj3A5N8.png) **Note:** **Autoscaling** groups not shown to reduce clutter diff --git a/solutions/system_design/social_graph/README.md b/solutions/system_design/social_graph/README.md index f7dfd4efe8..5d73eefc18 100644 --- a/solutions/system_design/social_graph/README.md +++ b/solutions/system_design/social_graph/README.md @@ -50,7 +50,7 @@ Handy conversion guide: > Outline a high level design with all important components. -![Imgur](http://i.imgur.com/wxXyq2J.png) +![High level design of the data structures for a social network](https://i.imgur.com/wxXyq2J.png) ## Step 3: Design core components @@ -250,7 +250,7 @@ For internal communications, we could use [Remote Procedure Calls](https://githu > Identify and address bottlenecks, given the constraints. 
-![Imgur](http://i.imgur.com/cdCv5g7.png) +![Scaled design of the data structures for a social network](https://i.imgur.com/cdCv5g7.png) **Important: Do not simply jump right into the final design from the initial design!** diff --git a/solutions/system_design/twitter/README.md b/solutions/system_design/twitter/README.md index d14996f152..451b70469c 100644 --- a/solutions/system_design/twitter/README.md +++ b/solutions/system_design/twitter/README.md @@ -93,7 +93,7 @@ Handy conversion guide: > Outline a high level design with all important components. -![Imgur](http://i.imgur.com/48tEA2j.png) +![High level design of the Twitter timeline and search (or Facebook feed and search)](https://i.imgur.com/48tEA2j.png) ## Step 3: Design core components @@ -223,7 +223,7 @@ The response would be similar to that of the home timeline, except for tweets ma > Identify and address bottlenecks, given the constraints. -![Imgur](http://i.imgur.com/jrUBAF7.png) +![Scaled design of the Twitter timeline and search (or Facebook feed and search)](https://i.imgur.com/jrUBAF7.png) **Important: Do not simply jump right into the final design from the initial design!** diff --git a/solutions/system_design/web_crawler/README.md b/solutions/system_design/web_crawler/README.md index e6e79ad224..bd75eae462 100644 --- a/solutions/system_design/web_crawler/README.md +++ b/solutions/system_design/web_crawler/README.md @@ -69,7 +69,7 @@ Handy conversion guide: > Outline a high level design with all important components. -![Imgur](http://i.imgur.com/xjdAAUv.png) +![High level design of a web crawler](https://i.imgur.com/xjdAAUv.png) ## Step 3: Design core components @@ -256,7 +256,7 @@ For internal communications, we could use [Remote Procedure Calls](https://githu > Identify and address bottlenecks, given the constraints. 
-![Imgur](http://i.imgur.com/bWxPtQA.png) +![Scaled design of a web crawler](https://i.imgur.com/bWxPtQA.png) **Important: Do not simply jump right into the final design from the initial design!**