|
19 | 19 | "id": "JATmSI8mcyW2" |
20 | 20 | }, |
21 | 21 | "source": [ |
22 | | - "In this recipe, we’ll demonstrate how to fine-tune [IBM's Granite Vision 3.1 2B Model](https://huggingface.co/ibm-granite/granite-vision-3.1-2b-preview).\n", |
| 22 | + "This recipe will enable you to fine-tune [IBM's Granite Vision 3.1 2B Model](https://huggingface.co/ibm-granite/granite-vision-3.1-2b-preview).\n", |
23 | 23 | "It is a lightweight yet capable model trained by fine-tuning a [Granite language model](https://huggingface.co/ibm-granite/granite-3.1-2b-instruct) with both image and text modalities.\n", |
24 | 24 | "We will be using the Hugging Face ecosystem, leveraging the powerful [Transformer Reinforcement Learning library (TRL)](https://huggingface.co/docs/trl/index). This step-by-step guide will enable you to fine-tune Granite Vision for your specific tasks, even on consumer GPUs.\n", |
25 | 25 | "\n", |
|
44 | 44 | }, |
45 | 45 | { |
46 | 46 | "cell_type": "code", |
47 | | - "execution_count": 1, |
| 47 | + "execution_count": null, |
48 | 48 | "metadata": { |
49 | 49 | "id": "GCMhPmFdIGSb" |
50 | 50 | }, |
|
57 | 57 | }, |
58 | 58 | { |
59 | 59 | "cell_type": "code", |
60 | | - "execution_count": 2, |
| 60 | + "execution_count": null, |
61 | 61 | "metadata": { |
62 | 62 | "colab": { |
63 | 63 | "base_uri": "https://localhost:8080/" |
|
297 | 297 | "source": [ |
298 | 298 | "## 3. Load Model and Check Performance! 🤔\n", |
299 | 299 | "\n", |
300 | | - "Now that we’ve loaded the dataset, it’s time to load the [IBM's Granite Vision Model](https://huggingface.co/ibm-granite/granite-vision-3.1-2b-preview), a 3\n", |
301 | | - "2B parameter Vision Language Model (VLM) built on that offers state-of-the-art (SOTA) performance while being efficient in terms of memory usage.\n", |
| 300 | + "Now that we’ve loaded the dataset, it’s time to load [IBM's Granite Vision Model](https://huggingface.co/ibm-granite/granite-vision-3.1-2b-preview), a 2B parameter Vision Language Model (VLM) that offers state-of-the-art (SOTA) performance while being efficient in terms of memory usage.\n", |
302 | 301 | "\n", |
303 | 302 | "For a broader comparison of state-of-the-art VLMs, explore the [WildVision Arena](https://huggingface.co/spaces/WildVision/vision-arena) and the [OpenVLM Leaderboard](https://huggingface.co/spaces/opencompass/open_vlm_leaderboard), where you can find the best-performing models across various benchmarks.\n" |
304 | 303 | ] |
305 | 304 | }, |
306 | 305 | { |
307 | 306 | "cell_type": "code", |
308 | | - "execution_count": 9, |
| 307 | + "execution_count": null, |
309 | 308 | "metadata": { |
310 | 309 | "id": "PCJhM6tCw4lq" |
311 | 310 | }, |
|
328 | 327 | }, |
329 | 328 | { |
330 | 329 | "cell_type": "code", |
331 | | - "execution_count": 10, |
| 330 | + "execution_count": null, |
332 | 331 | "metadata": { |
333 | 332 | "collapsed": true, |
334 | 333 | "id": "awtjIq86JfFF", |
|
384 | 383 | }, |
385 | 384 | { |
386 | 385 | "cell_type": "code", |
387 | | - "execution_count": 11, |
| 386 | + "execution_count": null, |
388 | 387 | "metadata": { |
389 | 388 | "id": "i-eIIdL9lqJJ" |
390 | 389 | }, |
|
424 | 423 | }, |
425 | 424 | { |
426 | 425 | "cell_type": "code", |
427 | | - "execution_count": 12, |
| 426 | + "execution_count": null, |
428 | 427 | "metadata": { |
429 | 428 | "id": "QavnLzjJUbxf" |
430 | 429 | }, |
|
456 | 455 | }, |
457 | 456 | { |
458 | 457 | "cell_type": "code", |
459 | | - "execution_count": 13, |
| 458 | + "execution_count": null, |
460 | 459 | "metadata": { |
461 | 460 | "id": "_MoRTjFcE8qD" |
462 | 461 | }, |
|
503 | 502 | }, |
504 | 503 | { |
505 | 504 | "cell_type": "code", |
506 | | - "execution_count": 14, |
| 505 | + "execution_count": null, |
507 | 506 | "metadata": { |
508 | 507 | "colab": { |
509 | 508 | "base_uri": "https://localhost:8080/", |
|
552 | 551 | }, |
553 | 552 | { |
554 | 553 | "cell_type": "code", |
555 | | - "execution_count": 15, |
| 554 | + "execution_count": null, |
556 | 555 | "metadata": { |
557 | 556 | "id": "dxkXZuUkvy8j" |
558 | 557 | }, |
|
617 | 616 | }, |
618 | 617 | { |
619 | 618 | "cell_type": "code", |
620 | | - "execution_count": 16, |
| 619 | + "execution_count": null, |
621 | 620 | "metadata": { |
622 | 621 | "id": "zm_bJRrXsESg" |
623 | 622 | }, |
|
680 | 679 | }, |
681 | 680 | { |
682 | 681 | "cell_type": "code", |
683 | | - "execution_count": 17, |
| 682 | + "execution_count": null, |
684 | 683 | "metadata": { |
685 | 684 | "id": "ITmkRHWCKYjf" |
686 | 685 | }, |
|
724 | 723 | }, |
725 | 724 | { |
726 | 725 | "cell_type": "code", |
727 | | - "execution_count": 18, |
| 726 | + "execution_count": null, |
728 | 727 | "metadata": { |
729 | 728 | "id": "SbqX1pQUKaSM" |
730 | 729 | }, |
|
779 | 778 | }, |
780 | 779 | { |
781 | 780 | "cell_type": "code", |
782 | | - "execution_count": 19, |
| 781 | + "execution_count": null, |
783 | 782 | "metadata": { |
784 | 783 | "id": "pAzDovzylQeZ" |
785 | 784 | }, |
|
859 | 858 | }, |
860 | 859 | { |
861 | 860 | "cell_type": "code", |
862 | | - "execution_count": 21, |
| 861 | + "execution_count": null, |
863 | 862 | "metadata": { |
864 | 863 | "id": "p1rgMTBDLboO" |
865 | 864 | }, |
|
940 | 939 | }, |
941 | 940 | { |
942 | 941 | "cell_type": "code", |
943 | | - "execution_count": 22, |
| 942 | + "execution_count": null, |
944 | 943 | "metadata": { |
945 | 944 | "id": "tE8usZw0lgrL" |
946 | 945 | }, |
|
971 | 970 | }, |
972 | 971 | { |
973 | 972 | "cell_type": "code", |
974 | | - "execution_count": 23, |
| 973 | + "execution_count": null, |
975 | 974 | "metadata": { |
976 | 975 | "colab": { |
977 | 976 | "base_uri": "https://localhost:8080/" |
|
1004 | 1003 | }, |
1005 | 1004 | { |
1006 | 1005 | "cell_type": "code", |
1007 | | - "execution_count": 24, |
| 1006 | + "execution_count": null, |
1008 | 1007 | "metadata": { |
1009 | 1008 | "id": "EFqTNUud2lA7" |
1010 | 1009 | }, |
|
1046 | 1045 | }, |
1047 | 1046 | { |
1048 | 1047 | "cell_type": "code", |
1049 | | - "execution_count": 25, |
| 1048 | + "execution_count": null, |
1050 | 1049 | "metadata": { |
1051 | 1050 | "id": "mQi2xBXk4sHe" |
1052 | 1051 | }, |
|
1067 | 1066 | }, |
1068 | 1067 | { |
1069 | 1068 | "cell_type": "code", |
1070 | | - "execution_count": 26, |
| 1069 | + "execution_count": null, |
1071 | 1070 | "metadata": { |
1072 | 1071 | "colab": { |
1073 | 1072 | "base_uri": "https://localhost:8080/" |
|
1099 | 1098 | }, |
1100 | 1099 | { |
1101 | 1100 | "cell_type": "code", |
1102 | | - "execution_count": 27, |
| 1101 | + "execution_count": null, |
1103 | 1102 | "metadata": { |
1104 | 1103 | "id": "ATuQ6ZS6eirO" |
1105 | 1104 | }, |
|
1122 | 1121 | }, |
1123 | 1122 | { |
1124 | 1123 | "cell_type": "code", |
1125 | | - "execution_count": 28, |
| 1124 | + "execution_count": null, |
1126 | 1125 | "metadata": { |
1127 | 1126 | "colab": { |
1128 | 1127 | "base_uri": "https://localhost:8080/", |
|