|
138 | 138 | <p class=MsoNormal><span style='font-size:14.0pt;line-height:115%'>Though a |
139 | 139 | 2-billion-parameter model is small, it still requires 4 to 8 GB of memory.</span></p> |
140 | 140 |
|
141 | | -<p class=MsoNormal><span style='font-size:14.0pt;line-height:115%'> </span></p> |
142 | | - |
143 | | -<p class=MsoNormal><span style='font-size:14.0pt;line-height:115%'>1. Using |
144 | | -float32 (4 bytes per parameter):</span></p> |
145 | | - |
146 | | -<p class=MsoNormal><span style='font-size:14.0pt;line-height:115%'>2,000,000,000 |
147 | | -parameters × 4 bytes = 8,000,000,000 bytes = <b>8 GB</b></span></p> |
148 | | - |
149 | | -<p class=MsoNormal><span style='font-size:14.0pt;line-height:115%'> </span></p> |
150 | | - |
151 | | -<p class=MsoNormal><span style='font-size:14.0pt;line-height:115%'>2. Using |
152 | | -float16 (2 bytes per parameter): </span></p> |
153 | | - |
154 | | -<p class=MsoNormal><span style='font-size:14.0pt;line-height:115%'>2,000,000,000 |
155 | | -× 2 bytes = 4,000,000,000 bytes = <b>4 GB</b></span></p> |
156 | | - |
157 | | -<p class=MsoNormal><span style='font-size:14.0pt;line-height:115%'> </span></p> |
| 141 | +<p class=MsoNormal><img border=0 width=1099 height=239 |
| 142 | +src="doc_files/image001.png"></p> |
158 | 143 |
|
159 | 144 | <p class=MsoNormal><span style='font-size:14.0pt;line-height:115%'>In order to |
160 | 145 | run bigger models on our machines, we use a solution called <b>quantization</b>, |
|
192 | 177 | <p class=MsoNormal><span style='font-size:14.0pt;line-height:115%'>After |
193 | 178 | quantizing a 2-billion-parameter model (using integers):</span></p> |
194 | 179 |
|
195 | | -<p class=MsoNormal><b><span style='font-size:14.0pt;line-height:115%'>Using |
196 | | -int8 (1 byte):</span></b></p> |
197 | | - |
198 | | -<ul style='margin-top:0in' type=disc> |
199 | | - <li class=MsoNormal><span style='font-size:14.0pt;line-height:115%'>2B × 1 |
200 | | - byte = <b>2 GB</b></span></li> |
201 | | -</ul> |
202 | | - |
203 | | -<p class=MsoNormal><b><span style='font-size:14.0pt;line-height:115%'>Using |
204 | | -int4 (0.5 byte):</span></b></p> |
| 180 | +<p class=MsoNormal><img border=0 width=928 height=180 |
| 181 | +src="doc_files/image002.png"></p> |
205 | 182 |
|
206 | 183 | <ul style='margin-top:0in' type=disc> |
207 | | - <li class=MsoNormal><span style='font-size:14.0pt;line-height:115%'>2B × 0.5 |
208 | | - byte = <b>1 GB</b></span></li> |
209 | 184 | <li class=MsoNormal><span style='font-size:14.0pt;line-height:115%'> </span></li> |
210 | 185 | </ul> |
211 | 186 |
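The memory figures quoted in this part of the document (8 GB for float32, 4 GB for float16, 2 GB for int8, 1 GB for int4) all come from multiplying the parameter count by the bytes stored per parameter. A minimal sketch of that arithmetic follows. The names `BYTES_PER_PARAM` and `weight_memory_gb` are hypothetical, and it assumes decimal gigabytes (1 GB = 10^9 bytes) and counts only the weights, not activations or runtime overhead.

```python
# Bytes stored per parameter for each precision named in the text.
# (Hypothetical helper names; not part of the original document.)
BYTES_PER_PARAM = {"float32": 4.0, "float16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(num_params: int, dtype: str) -> float:
    """Return the weight storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


if __name__ == "__main__":
    # Reproduce the figures for a 2-billion-parameter model.
    for dtype in BYTES_PER_PARAM:
        print(f"{dtype}: {weight_memory_gb(2_000_000_000, dtype):.1f} GB")
```

Real memory use will usually be somewhat higher than these figures, since the sketch leaves out everything except the raw weights.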
|
212 | 187 | <p class=MsoNormal><img border=0 width=1467 height=744 |
213 | | -src="doc_files/image001.jpg"></p> |
| 188 | +src="doc_files/image003.jpg"></p> |
214 | 189 |
|
215 | 190 | <p class=MsoNormal><span style='font-size:14.0pt;line-height:115%'> </span></p> |
216 | 191 |
|
|
227 | 202 | <p class=MsoNormal><span style='font-size:14.0pt;line-height:115%'> </span></p> |
228 | 203 |
|
229 | 204 | <p class=MsoNormal><img border=0 width=1465 height=655 |
230 | | -src="doc_files/image002.jpg"></p> |
| 205 | +src="doc_files/image004.jpg"></p> |
231 | 206 |
|
232 | 207 | <p class=MsoNormal><span style='font-size:14.0pt;line-height:115%'>Figure 2</span></p> |
233 | 208 |
|
|
290 | 265 | locally.</span></p> |
291 | 266 | <p class=MsoNormal><span style='font-size:14.0pt;line-height:115%'> </span></p> |
292 | 267 | <p class=MsoNormal><img border=0 width=1410 height=553 |
293 | | - src="doc_files/image003.jpg"></p> |
| 268 | + src="doc_files/image005.jpg"></p> |
294 | 269 | </td> |
295 | 270 | <td style='padding:.75pt .75pt .75pt .75pt'> |
296 | 271 | <p class=MsoNormal><span style='font-size:14.0pt;line-height:115%'> </span></p> |
|
310 | 285 | versions that we can run locally.</span></p> |
311 | 286 | <p class=MsoNormal><span style='font-size:14.0pt;line-height:115%'> </span></p> |
312 | 287 | <p class=MsoNormal><img border=0 width=1304 height=788 |
313 | | - src="doc_files/image004.png"></p> |
| 288 | + src="doc_files/image006.png"></p> |
314 | 289 | </td> |
315 | 290 | <td style='padding:.75pt .75pt .75pt .75pt'> |
316 | 291 | <p class=MsoNormal><span style='font-size:14.0pt;line-height:115%'> </span></p> |
|
327 | 302 | have the same number of parameters. </span></p> |
328 | 303 | <p class=MsoNormal><span style='font-size:14.0pt;line-height:115%'> </span></p> |
329 | 304 | <p class=MsoNormal><img border=0 width=1360 height=562 |
330 | | - src="doc_files/image005.jpg"></p> |
| 305 | + src="doc_files/image007.jpg"></p> |
331 | 306 | <p class=MsoNormal><span style='font-size:14.0pt;line-height:115%'> </span></p> |
332 | 307 | <p class=MsoNormal><span style='font-size:14.0pt;line-height:115%'>Figure 5</span></p> |
333 | 308 | <p class=MsoNormal><span style='font-size:14.0pt;line-height:115%'>Originally, |
|
338 | 313 | associated with quantized models.</span></p> |
339 | 314 | <p class=MsoNormal><span style='font-size:14.0pt;line-height:115%'>We should |
340 | 315 | have a GPU capable of running the model, along with enough VRAM to load the |
341 | | - quantized parameters. However, if we dont have a suitable GPU, that's okay</span></p> |
| 316 | + quantized parameters. However, if we don't have a suitable GPU, that's okay</span></p> |
342 | 317 | <p class=MsoNormal><span style='font-size:14.0pt;line-height:115%'>in this |
343 | 318 | case, the CPU and regular system RAM can also handle the model, though it |
344 | 319 | will run more slowly.</span></p> |
|
348 | 323 | hardware profile.</span></p> |
349 | 324 | <p class=MsoNormal><span style='font-size:14.0pt;line-height:115%'> </span></p> |
350 | 325 | <p class=MsoNormal><img border=0 width=1400 height=681 |
351 | | - src="doc_files/image006.jpg"></p> |
| 326 | + src="doc_files/image008.jpg"></p> |
352 | 327 | </td> |
353 | 328 | <td style='padding:.75pt .75pt .75pt .75pt'> |
354 | 329 | <p class=MsoNormal><span style='font-size:14.0pt;line-height:115%'> </span></p> |
|
364 | 339 | Settings.</span></p> |
365 | 340 | <p class=MsoNormal><span style='font-size:14.0pt;line-height:115%'> </span></p> |
366 | 341 | <p class=MsoNormal><img border=0 width=1496 height=797 |
367 | | - src="doc_files/image007.jpg"></p> |
| 342 | + src="doc_files/image009.jpg"></p> |
368 | 343 | </td> |
369 | 344 | <td style='padding:.75pt .75pt .75pt .75pt'> |
370 | 345 | <p class=MsoNormal><span style='font-size:14.0pt;line-height:115%'> </span></p> |
|
396 | 371 | end.</span></p> |
397 | 372 | <p class=MsoNormal><span style='font-size:14.0pt;line-height:115%'> </span></p> |
398 | 373 | <p class=MsoNormal><img border=0 width=1484 height=618 |
399 | | - src="doc_files/image008.jpg"></p> |
| 374 | + src="doc_files/image010.jpg"></p> |
400 | 375 | </td> |
401 | 376 | <td style='padding:.75pt .75pt .75pt .75pt'> |
402 | 377 | <p class=MsoNormal><span style='font-size:14.0pt;line-height:115%'> </span></p> |
|
412 | 387 | back to the quantized model, you'll see the hardware profile I shared.</span></p> |
413 | 388 | <p class=MsoNormal><span style='font-size:14.0pt;line-height:115%'> </span></p> |
414 | 389 | <p class=MsoNormal><img border=0 width=1409 height=442 |
415 | | - src="doc_files/image009.jpg"></p> |
| 390 | + src="doc_files/image011.jpg"></p> |
416 | 391 | </td> |
417 | 392 | <td style='padding:.75pt .75pt .75pt .75pt'> |
418 | 393 | <p class=MsoNormal><span style='font-size:14.0pt;line-height:115%'> </span></p> |
|
442 | 417 | <p class=MsoNormal><span style='font-size:14.0pt;line-height:115%'> </span></p> |
443 | 418 |
|
444 | 419 | <p class=MsoNormal><img border=0 width=1416 height=562 |
445 | | -src="doc_files/image010.jpg"></p> |
| 420 | +src="doc_files/image012.jpg"></p> |
446 | 421 |
|
447 | 422 | <p class=MsoNormal><span style='font-size:14.0pt;line-height:115%'>Figure 10</span></p> |
448 | 423 |
|
|
454 | 429 | <p class=MsoNormal><span style='font-size:14.0pt;line-height:115%'> </span></p> |
455 | 430 |
|
456 | 431 | <p class=MsoNormal><img border=0 width=1434 height=595 |
457 | | -src="doc_files/image011.jpg"></p> |
| 432 | +src="doc_files/image013.jpg"></p> |
458 | 433 |
|
459 | 434 | <p class=MsoNormal><span style='font-size:14.0pt;line-height:115%'>Figure 11</span></p> |
460 | 435 |
|
|
465 | 440 | <p class=MsoNormal><span style='font-size:14.0pt;line-height:115%'> </span></p> |
466 | 441 |
|
467 | 442 | <p class=MsoNormal><img border=0 width=1402 height=621 id="Picture 1" |
468 | | -src="doc_files/image012.jpg"></p> |
| 443 | +src="doc_files/image014.jpg"></p> |
469 | 444 |
|
470 | 445 | <p class=MsoNormal><span style='font-size:14.0pt;line-height:115%'>Figure 12</span></p> |
471 | 446 |
|
|
481 | 456 | <p class=MsoNormal><span style='font-size:14.0pt;line-height:115%'> </span></p> |
482 | 457 |
|
483 | 458 | <p class=MsoNormal><img border=0 width=1275 height=696 id="Picture 2" |
484 | | -src="doc_files/image013.jpg"></p> |
| 459 | +src="doc_files/image015.jpg"></p> |
485 | 460 |
|
486 | 461 | <p class=MsoNormal><span style='font-size:14.0pt;line-height:115%'>Figure 13</span></p> |
487 | 462 |
|
|
496 | 471 | <p class=MsoNormal><span style='font-size:14.0pt;line-height:115%'> </span></p> |
497 | 472 |
|
498 | 473 | <p class=MsoNormal><img border=0 width=1327 height=606 id="Picture 3" |
499 | | -src="doc_files/image014.jpg"></p> |
| 474 | +src="doc_files/image016.jpg"></p> |
500 | 475 |
|
501 | 476 | <p class=MsoNormal><span style='font-size:14.0pt;line-height:115%'>Figure 14</span></p> |
502 | 477 |
|
|
510 | 485 | <p class=MsoNormal><span style='font-size:14.0pt;line-height:115%'> </span></p> |
511 | 486 |
|
512 | 487 | <p class=MsoNormal><img border=0 width=1284 height=501 id="Picture 4" |
513 | | -src="doc_files/image015.png"></p> |
| 488 | +src="doc_files/image017.png"></p> |
514 | 489 |
|
515 | 490 | <p class=MsoNormal><span style='font-size:14.0pt;line-height:115%'>Figure 15</span></p> |
516 | 491 |
|
|
521 | 496 | <p class=MsoNormal><span style='font-size:14.0pt;line-height:115%'> </span></p> |
522 | 497 |
|
523 | 498 | <p class=MsoNormal><img border=0 width=1249 height=806 id="Picture 5" |
524 | | -src="doc_files/image016.png"></p> |
| 499 | +src="doc_files/image018.png"></p> |
525 | 500 |
|
526 | 501 | <p class=MsoNormal><span style='font-size:14.0pt;line-height:115%'>Figure 16</span></p> |
527 | 502 |
|
|
531 | 506 | <p class=MsoNormal><span style='font-size:14.0pt;line-height:115%'> </span></p> |
532 | 507 |
|
533 | 508 | <p class=MsoNormal><img border=0 width=1271 height=658 id="Picture 6" |
534 | | -src="doc_files/image017.jpg"></p> |
| 509 | +src="doc_files/image019.jpg"></p> |
535 | 510 |
|
536 | 511 | <p class=MsoNormal><span style='font-size:14.0pt;line-height:115%'>Figure 17</span></p> |
537 | 512 |
|
|
541 | 516 | <p class=MsoNormal><span style='font-size:14.0pt;line-height:115%'> </span></p> |
542 | 517 |
|
543 | 518 | <p class=MsoNormal><img border=0 width=1325 height=771 id="Picture 7" |
544 | | -src="doc_files/image018.png"></p> |
| 519 | +src="doc_files/image020.png"></p> |
545 | 520 |
|
546 | 521 | <p class=MsoNormal><span style='font-size:14.0pt;line-height:115%'>Figure 18</span></p> |
547 | 522 |
|
|
553 | 528 | <p class=MsoNormal><span style='font-size:14.0pt;line-height:115%'> </span></p> |
554 | 529 |
|
555 | 530 | <p class=MsoNormal><img border=0 width=1186 height=798 id="Picture 8" |
556 | | -src="doc_files/image019.png"></p> |
| 531 | +src="doc_files/image021.png"></p> |
557 | 532 |
|
558 | 533 | <p class=MsoNormal><span style='font-size:14.0pt;line-height:115%'>Figure 19</span></p> |
559 | 534 |
|
|
573 | 548 | <p class=MsoNormal><span style='font-size:14.0pt;line-height:115%'> </span></p> |
574 | 549 |
|
575 | 550 | <p class=MsoNormal><img border=0 width=1371 height=688 id="Picture 9" |
576 | | -src="doc_files/image020.jpg"></p> |
| 551 | +src="doc_files/image022.jpg"></p> |
577 | 552 |
|
578 | 553 | <p class=MsoNormal><span style='font-size:14.0pt;line-height:115%'>Figure 20</span></p> |
579 | 554 |
|
|
588 | 563 | <p class=MsoNormal><span style='font-size:14.0pt;line-height:115%'> </span></p> |
589 | 564 |
|
590 | 565 | <p class=MsoNormal><img border=0 width=1380 height=643 id="Picture 10" |
591 | | -src="doc_files/image021.jpg"></p> |
| 566 | +src="doc_files/image023.jpg"></p> |
592 | 567 |
|
593 | 568 | <p class=MsoNormal><span style='font-size:14.0pt;line-height:115%'>Figure 21</span></p> |
594 | 569 |
|
|
604 | 579 | <p class=MsoNormal><span style='font-size:14.0pt;line-height:115%'> </span></p> |
605 | 580 |
|
606 | 581 | <p class=MsoNormal><img border=0 width=1407 height=642 id="Picture 11" |
607 | | -src="doc_files/image022.jpg"></p> |
| 582 | +src="doc_files/image024.jpg"></p> |
608 | 583 |
|
609 | 584 | <p class=MsoNormal><span style='font-size:14.0pt;line-height:115%'>Figure 22</span></p> |
610 | 585 |
|
|
615 | 590 | <p class=MsoNormal><span style='font-size:14.0pt;line-height:115%'> </span></p> |
616 | 591 |
|
617 | 592 | <p class=MsoNormal><img border=0 width=1444 height=761 id="Picture 12" |
618 | | -src="doc_files/image023.png"></p> |
| 593 | +src="doc_files/image025.png"></p> |
619 | 594 |
|
620 | 595 | <p class=MsoNormal><span style='font-size:14.0pt;line-height:115%'>Figure 23</span></p> |
621 | 596 |
|
|
630 | 605 | <p class=MsoNormal><span style='font-size:14.0pt;line-height:115%'> </span></p> |
631 | 606 |
|
632 | 607 | <p class=MsoNormal><img border=0 width=1452 height=596 id="Picture 13" |
633 | | -src="doc_files/image024.jpg"></p> |
| 608 | +src="doc_files/image026.jpg"></p> |
634 | 609 |
|
635 | 610 | <p class=MsoNormal><span style='font-size:14.0pt;line-height:115%'>Figure 24</span></p> |
636 | 611 |
|
|
651 | 626 | <p class=MsoNormal><span style='font-size:14.0pt;line-height:115%'> </span></p> |
652 | 627 |
|
653 | 628 | <p class=MsoNormal><img border=0 width=1303 height=796 id="Picture 14" |
654 | | -src="doc_files/image025.png"></p> |
| 629 | +src="doc_files/image027.png"></p> |
655 | 630 |
|
656 | 631 | <p class=MsoNormal><span style='font-size:14.0pt;line-height:115%'>Figure 25</span></p> |
657 | 632 |
|
|
661 | 636 | <p class=MsoNormal><span style='font-size:14.0pt;line-height:115%'> </span></p> |
662 | 637 |
|
663 | 638 | <p class=MsoNormal><img border=0 width=1385 height=736 id="Picture 15" |
664 | | -src="doc_files/image026.png"></p> |
| 639 | +src="doc_files/image028.png"></p> |
665 | 640 |
|
666 | 641 | <p class=MsoNormal><span style='font-size:14.0pt;line-height:115%'>Figure 26</span></p> |
667 | 642 |
|
|
672 | 647 | <p class=MsoNormal><span style='font-size:14.0pt;line-height:115%'> </span></p> |
673 | 648 |
|
674 | 649 | <p class=MsoNormal><img border=0 width=1423 height=667 id="Picture 16" |
675 | | -src="doc_files/image027.png"></p> |
| 650 | +src="doc_files/image029.png"></p> |
676 | 651 |
|
677 | 652 | <p class=MsoNormal><span style='font-size:14.0pt;line-height:115%'>Figure 27</span></p> |
678 | 653 |
|
|
682 | 657 | <p class=MsoNormal><span style='font-size:14.0pt;line-height:115%'> </span></p> |
683 | 658 |
|
684 | 659 | <p class=MsoNormal><img border=0 width=1237 height=836 id="Picture 17" |
685 | | -src="doc_files/image028.png"></p> |
| 660 | +src="doc_files/image030.png"></p> |
686 | 661 |
|
687 | 662 | <p class=MsoNormal><span style='font-size:14.0pt;line-height:115%'>Figure 28</span></p> |
688 | 663 |
|
|