Commit 7d98126

committed
Deployed 16048b3 with MkDocs version: 1.6.1
1 parent 19abf64 commit 7d98126

4 files changed

+40
-41
lines changed

features/custom-models/index.html

Lines changed: 24 additions & 25 deletions
@@ -76,7 +76,7 @@
7676
<div data-md-component="skip">
7777

7878

79-
<a href="#run-custom-langchain-openai-model" class="md-skip">
79+
<a href="#run-custom-model" class="md-skip">
8080
Skip to content
8181
</a>
8282

@@ -435,9 +435,9 @@
435435
<ul class="md-nav__list" data-md-component="toc" data-md-scrollfix>
436436

437437
<li class="md-nav__item">
438-
<a href="#run-custom-langchain-openai-model" class="md-nav__link">
438+
<a href="#run-custom-model" class="md-nav__link">
439439
<span class="md-ellipsis">
440-
Run custom langchain OpenAI model
440+
Run custom model
441441
</span>
442442
</a>
443443

@@ -732,9 +732,9 @@
732732
<ul class="md-nav__list" data-md-component="toc" data-md-scrollfix>
733733

734734
<li class="md-nav__item">
735-
<a href="#run-custom-langchain-openai-model" class="md-nav__link">
735+
<a href="#run-custom-model" class="md-nav__link">
736736
<span class="md-ellipsis">
737-
Run custom langchain OpenAI model
737+
Run custom model
738738
</span>
739739
</a>
740740

@@ -778,28 +778,27 @@
778778

779779
<h1>Custom models</h1>
780780

781-
<p>Note that small local models tend to trim long outputs and could require more careful tuning of data description. </p>
782-
<h2 id="run-custom-langchain-openai-model">Run custom langchain OpenAI model</h2>
783-
<p>You can instantiate <code>Parsera</code> with any chat model supported by LangChain, for example, to run the model from Azure:<br />
781+
<p>All custom models are run with <a href="/features/extractors/#chunks-tabular-extractor"><code>ChunksTabularExtractor</code></a>.
782+
If you want a custom extractor, you need to initialize it with the model of your choice.</p>
783+
<p>Note that small local models tend to trim long outputs and could require more careful tuning of data description.</p>
784+
<h2 id="run-custom-model">Run custom model</h2>
785+
<p>You can instantiate <code>Parsera</code> with any chat model supported by LangChain. For example, to run <code>gpt-4o-mini</code> through the OpenAI API:<br />
784786
<div class="language-python highlight"><pre><span></span><code><span id="__span-0-1"><a id="__codelineno-0-1" name="__codelineno-0-1" href="#__codelineno-0-1"></a><span class="kn">import</span><span class="w"> </span><span class="nn">os</span>
785-
</span><span id="__span-0-2"><a id="__codelineno-0-2" name="__codelineno-0-2" href="#__codelineno-0-2"></a><span class="kn">from</span><span class="w"> </span><span class="nn">langchain_openai</span><span class="w"> </span><span class="kn">import</span> <span class="n">AzureChatOpenAI</span>
787+
</span><span id="__span-0-2"><a id="__codelineno-0-2" name="__codelineno-0-2" href="#__codelineno-0-2"></a><span class="kn">from</span><span class="w"> </span><span class="nn">langchain_openai</span><span class="w"> </span><span class="kn">import</span> <span class="n">ChatOpenAI</span>
786788
</span><span id="__span-0-3"><a id="__codelineno-0-3" name="__codelineno-0-3" href="#__codelineno-0-3"></a>
787-
</span><span id="__span-0-4"><a id="__codelineno-0-4" name="__codelineno-0-4" href="#__codelineno-0-4"></a><span class="n">llm</span> <span class="o">=</span> <span class="n">AzureChatOpenAI</span><span class="p">(</span>
788-
</span><span id="__span-0-5"><a id="__codelineno-0-5" name="__codelineno-0-5" href="#__codelineno-0-5"></a> <span class="n">azure_endpoint</span><span class="o">=</span><span class="n">os</span><span class="o">.</span><span class="n">getenv</span><span class="p">(</span><span class="s2">&quot;AZURE_GPT_BASE_URL&quot;</span><span class="p">),</span>
789-
</span><span id="__span-0-6"><a id="__codelineno-0-6" name="__codelineno-0-6" href="#__codelineno-0-6"></a> <span class="n">openai_api_version</span><span class="o">=</span><span class="s2">&quot;2023-05-15&quot;</span><span class="p">,</span>
790-
</span><span id="__span-0-7"><a id="__codelineno-0-7" name="__codelineno-0-7" href="#__codelineno-0-7"></a> <span class="n">deployment_name</span><span class="o">=</span><span class="n">os</span><span class="o">.</span><span class="n">getenv</span><span class="p">(</span><span class="s2">&quot;AZURE_GPT_DEPLOYMENT_NAME&quot;</span><span class="p">),</span>
791-
</span><span id="__span-0-8"><a id="__codelineno-0-8" name="__codelineno-0-8" href="#__codelineno-0-8"></a> <span class="n">openai_api_key</span><span class="o">=</span><span class="n">os</span><span class="o">.</span><span class="n">getenv</span><span class="p">(</span><span class="s2">&quot;AZURE_GPT_API_KEY&quot;</span><span class="p">),</span>
792-
</span><span id="__span-0-9"><a id="__codelineno-0-9" name="__codelineno-0-9" href="#__codelineno-0-9"></a> <span class="n">openai_api_type</span><span class="o">=</span><span class="s2">&quot;azure&quot;</span><span class="p">,</span>
793-
</span><span id="__span-0-10"><a id="__codelineno-0-10" name="__codelineno-0-10" href="#__codelineno-0-10"></a> <span class="n">temperature</span><span class="o">=</span><span class="mf">0.0</span><span class="p">,</span>
794-
</span><span id="__span-0-11"><a id="__codelineno-0-11" name="__codelineno-0-11" href="#__codelineno-0-11"></a><span class="p">)</span>
795-
</span><span id="__span-0-12"><a id="__codelineno-0-12" name="__codelineno-0-12" href="#__codelineno-0-12"></a>
796-
</span><span id="__span-0-13"><a id="__codelineno-0-13" name="__codelineno-0-13" href="#__codelineno-0-13"></a><span class="n">url</span> <span class="o">=</span> <span class="s2">&quot;https://github.com/raznem/parsera&quot;</span>
797-
</span><span id="__span-0-14"><a id="__codelineno-0-14" name="__codelineno-0-14" href="#__codelineno-0-14"></a><span class="n">elements</span> <span class="o">=</span> <span class="p">{</span>
798-
</span><span id="__span-0-15"><a id="__codelineno-0-15" name="__codelineno-0-15" href="#__codelineno-0-15"></a> <span class="s2">&quot;Stars&quot;</span><span class="p">:</span> <span class="s2">&quot;Number of stars&quot;</span><span class="p">,</span>
799-
</span><span id="__span-0-16"><a id="__codelineno-0-16" name="__codelineno-0-16" href="#__codelineno-0-16"></a> <span class="s2">&quot;Fork&quot;</span><span class="p">:</span> <span class="s2">&quot;Number of forks&quot;</span><span class="p">,</span>
800-
</span><span id="__span-0-17"><a id="__codelineno-0-17" name="__codelineno-0-17" href="#__codelineno-0-17"></a><span class="p">}</span>
801-
</span><span id="__span-0-18"><a id="__codelineno-0-18" name="__codelineno-0-18" href="#__codelineno-0-18"></a><span class="n">scrapper</span> <span class="o">=</span> <span class="n">Parsera</span><span class="p">(</span><span class="n">model</span><span class="o">=</span><span class="n">llm</span><span class="p">)</span>
802-
</span><span id="__span-0-19"><a id="__codelineno-0-19" name="__codelineno-0-19" href="#__codelineno-0-19"></a><span class="n">result</span> <span class="o">=</span> <span class="n">scrapper</span><span class="o">.</span><span class="n">run</span><span class="p">(</span><span class="n">url</span><span class="o">=</span><span class="n">url</span><span class="p">,</span> <span class="n">elements</span><span class="o">=</span><span class="n">elements</span><span class="p">)</span>
789+
</span><span id="__span-0-4"><a id="__codelineno-0-4" name="__codelineno-0-4" href="#__codelineno-0-4"></a><span class="n">llm</span> <span class="o">=</span> <span class="n">ChatOpenAI</span><span class="p">(</span>
790+
</span><span id="__span-0-5"><a id="__codelineno-0-5" name="__codelineno-0-5" href="#__codelineno-0-5"></a> <span class="n">model</span><span class="o">=</span><span class="s2">&quot;gpt-4o-mini&quot;</span><span class="p">,</span>
791+
</span><span id="__span-0-6"><a id="__codelineno-0-6" name="__codelineno-0-6" href="#__codelineno-0-6"></a> <span class="n">temperature</span><span class="o">=</span><span class="mf">0.0</span><span class="p">,</span>
792+
</span><span id="__span-0-7"><a id="__codelineno-0-7" name="__codelineno-0-7" href="#__codelineno-0-7"></a> <span class="n">timeout</span><span class="o">=</span><span class="mi">120</span><span class="p">,</span>
793+
</span><span id="__span-0-8"><a id="__codelineno-0-8" name="__codelineno-0-8" href="#__codelineno-0-8"></a><span class="p">)</span>
794+
</span><span id="__span-0-9"><a id="__codelineno-0-9" name="__codelineno-0-9" href="#__codelineno-0-9"></a>
795+
</span><span id="__span-0-10"><a id="__codelineno-0-10" name="__codelineno-0-10" href="#__codelineno-0-10"></a><span class="n">url</span> <span class="o">=</span> <span class="s2">&quot;https://github.com/raznem/parsera&quot;</span>
796+
</span><span id="__span-0-11"><a id="__codelineno-0-11" name="__codelineno-0-11" href="#__codelineno-0-11"></a><span class="n">elements</span> <span class="o">=</span> <span class="p">{</span>
797+
</span><span id="__span-0-12"><a id="__codelineno-0-12" name="__codelineno-0-12" href="#__codelineno-0-12"></a> <span class="s2">&quot;Stars&quot;</span><span class="p">:</span> <span class="s2">&quot;Number of stars&quot;</span><span class="p">,</span>
798+
</span><span id="__span-0-13"><a id="__codelineno-0-13" name="__codelineno-0-13" href="#__codelineno-0-13"></a> <span class="s2">&quot;Fork&quot;</span><span class="p">:</span> <span class="s2">&quot;Number of forks&quot;</span><span class="p">,</span>
799+
</span><span id="__span-0-14"><a id="__codelineno-0-14" name="__codelineno-0-14" href="#__codelineno-0-14"></a><span class="p">}</span>
800+
</span><span id="__span-0-15"><a id="__codelineno-0-15" name="__codelineno-0-15" href="#__codelineno-0-15"></a><span class="n">scrapper</span> <span class="o">=</span> <span class="n">Parsera</span><span class="p">(</span><span class="n">model</span><span class="o">=</span><span class="n">llm</span><span class="p">)</span>
801+
</span><span id="__span-0-16"><a id="__codelineno-0-16" name="__codelineno-0-16" href="#__codelineno-0-16"></a><span class="n">result</span> <span class="o">=</span> <span class="n">scrapper</span><span class="o">.</span><span class="n">run</span><span class="p">(</span><span class="n">url</span><span class="o">=</span><span class="n">url</span><span class="p">,</span> <span class="n">elements</span><span class="o">=</span><span class="n">elements</span><span class="p">)</span>
803802
</span></code></pre></div></p>
804803
<h2 id="run-local-model-with-ollama">Run local model with <code>Ollama</code></h2>
805804
<p>First, you should install and run <code>ollama</code> in your local environment: <a href="https://github.com/ollama/ollama?tab=readme-ov-file#ollama">official installation guide</a>.
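
For readability, the example added in the hunk above can be sketched without the syntax-highlighting markup. This is a minimal sketch, not the committed source: the `from parsera import Parsera` line does not appear in the diff and is assumed here, and the helper names `has_openai_key` and `run_example` are hypothetical. `ChatOpenAI` reads the API key from the `OPENAI_API_KEY` environment variable, which gives the otherwise unused `import os` something to do.

```python
import os

# Target page and fields, copied from the example in the diff.
URL = "https://github.com/raznem/parsera"
ELEMENTS = {
    "Stars": "Number of stars",
    "Fork": "Number of forks",
}


def has_openai_key() -> bool:
    """Return True if OPENAI_API_KEY is set to a non-empty value."""
    return bool(os.getenv("OPENAI_API_KEY"))


def run_example():
    """Build the model and scraper as in the diff, then extract ELEMENTS from URL."""
    if not has_openai_key():
        raise RuntimeError("Set OPENAI_API_KEY before calling run_example()")

    # Imported lazily so this module loads without the optional packages.
    from langchain_openai import ChatOpenAI
    from parsera import Parsera  # assumption: import not shown in the diff

    llm = ChatOpenAI(
        model="gpt-4o-mini",
        temperature=0.0,
        timeout=120,
    )
    scrapper = Parsera(model=llm)  # variable name kept from the original example
    return scrapper.run(url=URL, elements=ELEMENTS)
```

Calling `run_example()` performs a real request against the OpenAI API and the target page, so it needs network access and a valid key.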

search/search_index.json

Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.

sitemap.xml

Lines changed: 15 additions & 15 deletions
@@ -2,62 +2,62 @@
22
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
33
<url>
44
<loc>https://docs.parsera.org/</loc>
5-
<lastmod>2025-01-22</lastmod>
5+
<lastmod>2025-01-27</lastmod>
66
</url>
77
<url>
88
<loc>https://docs.parsera.org/contributing/</loc>
9-
<lastmod>2025-01-22</lastmod>
9+
<lastmod>2025-01-27</lastmod>
1010
</url>
1111
<url>
1212
<loc>https://docs.parsera.org/getting-started/</loc>
13-
<lastmod>2025-01-22</lastmod>
13+
<lastmod>2025-01-27</lastmod>
1414
</url>
1515
<url>
1616
<loc>https://docs.parsera.org/api/cookies/</loc>
17-
<lastmod>2025-01-22</lastmod>
17+
<lastmod>2025-01-27</lastmod>
1818
</url>
1919
<url>
2020
<loc>https://docs.parsera.org/api/getting-started/</loc>
21-
<lastmod>2025-01-22</lastmod>
21+
<lastmod>2025-01-27</lastmod>
2222
</url>
2323
<url>
2424
<loc>https://docs.parsera.org/api/precision-mode/</loc>
25-
<lastmod>2025-01-22</lastmod>
25+
<lastmod>2025-01-27</lastmod>
2626
</url>
2727
<url>
2828
<loc>https://docs.parsera.org/api/proxy/</loc>
29-
<lastmod>2025-01-22</lastmod>
29+
<lastmod>2025-01-27</lastmod>
3030
</url>
3131
<url>
3232
<loc>https://docs.parsera.org/features/custom-browser/</loc>
33-
<lastmod>2025-01-22</lastmod>
33+
<lastmod>2025-01-27</lastmod>
3434
</url>
3535
<url>
3636
<loc>https://docs.parsera.org/features/custom-cookies/</loc>
37-
<lastmod>2025-01-22</lastmod>
37+
<lastmod>2025-01-27</lastmod>
3838
</url>
3939
<url>
4040
<loc>https://docs.parsera.org/features/custom-models/</loc>
41-
<lastmod>2025-01-22</lastmod>
41+
<lastmod>2025-01-27</lastmod>
4242
</url>
4343
<url>
4444
<loc>https://docs.parsera.org/features/custom-playwright/</loc>
45-
<lastmod>2025-01-22</lastmod>
45+
<lastmod>2025-01-27</lastmod>
4646
</url>
4747
<url>
4848
<loc>https://docs.parsera.org/features/docker/</loc>
49-
<lastmod>2025-01-22</lastmod>
49+
<lastmod>2025-01-27</lastmod>
5050
</url>
5151
<url>
5252
<loc>https://docs.parsera.org/features/extractors/</loc>
53-
<lastmod>2025-01-22</lastmod>
53+
<lastmod>2025-01-27</lastmod>
5454
</url>
5555
<url>
5656
<loc>https://docs.parsera.org/features/proxy/</loc>
57-
<lastmod>2025-01-22</lastmod>
57+
<lastmod>2025-01-27</lastmod>
5858
</url>
5959
<url>
6060
<loc>https://docs.parsera.org/features/scrolling/</loc>
61-
<lastmod>2025-01-22</lastmod>
61+
<lastmod>2025-01-27</lastmod>
6262
</url>
6363
</urlset>

sitemap.xml.gz

1 Byte
Binary file not shown.
