Commit 3be8c0a

Added the changes
1 parent 4748199 commit 3be8c0a

17 files changed

Lines changed: 261 additions & 0 deletions

Binary file not shown.
Lines changed: 36 additions & 0 deletions
@@ -0,0 +1,36 @@
# Extract data as JSON from PDF using C#

The Syncfusion® Smart Data Extractor is a .NET library used to extract structured data and document elements from PDFs and images in a console application.

## Steps to Extract Data from PDF in Console App

Step 1: Create a new .NET Core console application project.

Step 2: Install the [Syncfusion.SmartDataExtractor.Net.Core](https://www.nuget.org/packages/Syncfusion.SmartDataExtractor.Net.Core) NuGet package as a reference to your project from [NuGet.org](https://www.nuget.org/).

Step 3: Include the following namespaces in the Program.cs file.

```csharp
using System.IO;
using System.Text;
using Syncfusion.SmartDataExtractor;
```

Step 4: Add the following code snippet to the Program.cs file to extract data from a PDF.

```csharp
// Open the input PDF file as a read-only stream.
using (FileStream stream = new FileStream(Path.GetFullPath("Input.pdf"), FileMode.Open, FileAccess.Read))
{
    // Initialize the Smart Data Extractor.
    DataExtractor extractor = new DataExtractor();
    // Extract data as JSON.
    string data = extractor.ExtractDataAsJson(stream);
    // Save the extracted JSON data into an output file.
    File.WriteAllText(Path.GetFullPath("Output.json"), data, Encoding.UTF8);
}
```

For more information about extracting data from PDF, refer to the [documentation](https://help.syncfusion.com/document-processing/data-extraction/smart-data-extractor/overview).
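
As an optional follow-up to Step 4, the sketch below checks that the saved file is well-formed JSON before other code consumes it. It assumes the result was written to `Output.json` in the working directory. `IsWellFormedJson` is a hypothetical helper name, and `System.Text.Json` is used here only for parsing; it is not something the Smart Data Extractor requires. No particular JSON schema is assumed.

```csharp
using System;
using System.IO;
using System.Text.Json;

// Assumption: Step 4 wrote its result to Output.json in the working directory.
string outputPath = Path.GetFullPath("Output.json");

Console.WriteLine(IsWellFormedJson(outputPath)
    ? $"{outputPath} contains well-formed JSON."
    : $"{outputPath} is missing or is not well-formed JSON.");

// Hypothetical helper (not part of the Syncfusion API): returns true only
// when the file exists and its content parses as JSON.
static bool IsWellFormedJson(string path)
{
    if (!File.Exists(path))
    {
        return false;
    }

    try
    {
        using JsonDocument document = JsonDocument.Parse(File.ReadAllText(path));
        return true;
    }
    catch (JsonException)
    {
        // Covers empty and malformed content.
        return false;
    }
}
```

A check like this only confirms that the file parses; it says nothing about which fields the extractor produced.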
Lines changed: 22 additions & 0 deletions
@@ -0,0 +1,22 @@

Microsoft Visual Studio Solution File, Format Version 12.00
# Visual Studio Version 18
VisualStudioVersion = 18.5.11716.220 stable
MinimumVisualStudioVersion = 10.0.40219.1
Project("{FAE04EC0-301F-11D3-BF4B-00C04F79EFBC}") = "Extract-data-as-md-from-PDF", "Extract-data-as-md-from-PDF\Extract-data-as-md-from-PDF.csproj", "{29872D0F-18F6-6AB7-0892-D538C0E179BA}"
EndProject
Global
	GlobalSection(SolutionConfigurationPlatforms) = preSolution
		Debug|Any CPU = Debug|Any CPU
		Release|Any CPU = Release|Any CPU
	EndGlobalSection
	GlobalSection(ProjectConfigurationPlatforms) = postSolution
		{29872D0F-18F6-6AB7-0892-D538C0E179BA}.Debug|Any CPU.ActiveCfg = Debug|Any CPU
		{29872D0F-18F6-6AB7-0892-D538C0E179BA}.Debug|Any CPU.Build.0 = Debug|Any CPU
		{29872D0F-18F6-6AB7-0892-D538C0E179BA}.Release|Any CPU.ActiveCfg = Release|Any CPU
		{29872D0F-18F6-6AB7-0892-D538C0E179BA}.Release|Any CPU.Build.0 = Release|Any CPU
	EndGlobalSection
	GlobalSection(SolutionProperties) = preSolution
		HideSolutionNode = FALSE
	EndGlobalSection
EndGlobal
@@ -0,0 +1,24 @@
<Project Sdk="Microsoft.NET.Sdk">

  <PropertyGroup>
    <OutputType>Exe</OutputType>
    <TargetFramework>net8.0</TargetFramework>
    <RootNamespace>Extract_data_as_md_from_PDF</RootNamespace>
    <ImplicitUsings>enable</ImplicitUsings>
    <Nullable>enable</Nullable>
  </PropertyGroup>

  <ItemGroup>
    <PackageReference Include="Syncfusion.SmartDataExtractor.Net.Core" Version="*" />
  </ItemGroup>

  <ItemGroup>
    <None Update="Data\Input.pdf">
      <CopyToOutputDirectory>Always</CopyToOutputDirectory>
    </None>
    <None Update="Output\.gitkeep">
      <CopyToOutputDirectory>Always</CopyToOutputDirectory>
    </None>
  </ItemGroup>

</Project>

Data-Extraction/Smart-Data-Extractor/Extract-data-as-md-from-PDF/.NET/Extract-data-as-md-from-PDF/Output/.gitkeep

Whitespace-only changes.
Lines changed: 13 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,13 @@
1+
using Syncfusion.SmartDataExtractor;
2+
using System.Text;
3+
4+
//Open the input PDF file as a stream.
5+
using (FileStream stream = new FileStream(Path.GetFullPath(@"Data\Input.pdf"), FileMode.Open, FileAccess.ReadWrite))
6+
{
7+
//Initialize the Smart Data Extractor.
8+
DataExtractor extractor = new DataExtractor();
9+
//Extract data as Markdown.
10+
string data = extractor.ExtractDataAsMarkdown(stream);
11+
//Save the extracted Markdown data into an output file.
12+
File.WriteAllText(Path.GetFullPath(@"Output\Output.md"), data, Encoding.UTF8);
13+
}
Lines changed: 36 additions & 0 deletions
@@ -0,0 +1,36 @@
# Extract data as Markdown from PDF using C#

The Syncfusion® Smart Data Extractor is a .NET library used to extract structured data and document elements from PDFs and images in a console application.

## Steps to Extract Data from PDF in Console App

Step 1: Create a new .NET Core console application project.

Step 2: Install the [Syncfusion.SmartDataExtractor.Net.Core](https://www.nuget.org/packages/Syncfusion.SmartDataExtractor.Net.Core) NuGet package as a reference to your project from [NuGet.org](https://www.nuget.org/).

Step 3: Include the following namespaces in the Program.cs file.

```csharp
using System.IO;
using System.Text;
using Syncfusion.SmartDataExtractor;
```

Step 4: Add the following code snippet to the Program.cs file to extract data from a PDF.

```csharp
// Open the input PDF file as a stream.
using (FileStream stream = new FileStream("Input.pdf", FileMode.Open, FileAccess.Read))
{
    // Initialize the Smart Data Extractor.
    DataExtractor extractor = new DataExtractor();
    // Extract data as Markdown.
    string data = extractor.ExtractDataAsMarkdown(stream);
    // Save the extracted Markdown data into an output file.
    File.WriteAllText("Output.md", data, Encoding.UTF8);
}
```

For more information about extracting data from PDF, refer to the [documentation](https://help.syncfusion.com/document-processing/data-extraction/smart-data-extractor/overview).
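
To get a quick overview of what Step 4 produced, the sketch below lists the heading lines found in the saved Markdown file. It assumes the result was written to `Output.md` in the working directory and that headings, if any, use the ATX style (`#` to `######`); whether the extractor emits such headings for a given PDF is not confirmed here. `GetHeadings` is a hypothetical helper name and is not part of the Syncfusion API.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Assumption: Step 4 wrote its result to Output.md in the working directory.
string markdownPath = Path.GetFullPath("Output.md");

if (File.Exists(markdownPath))
{
    foreach (string heading in GetHeadings(File.ReadLines(markdownPath)))
    {
        Console.WriteLine(heading);
    }
}
else
{
    Console.WriteLine($"{markdownPath} was not found. Run the extraction step first.");
}

// Hypothetical helper: returns lines that look like ATX headings, that is,
// one to six '#' characters followed by a space or the end of the line.
static List<string> GetHeadings(IEnumerable<string> lines)
{
    var headings = new List<string>();
    foreach (string line in lines)
    {
        string trimmed = line.Trim();
        int level = 0;
        while (level < trimmed.Length && trimmed[level] == '#')
        {
            level++;
        }

        bool isHeading = level >= 1 && level <= 6
            && (level == trimmed.Length || trimmed[level] == ' ');
        if (isHeading)
        {
            headings.Add(trimmed);
        }
    }
    return headings;
}
```

The helper works line by line and does not track fenced code blocks, so a `#` comment inside a code fence would also be reported.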
Lines changed: 33 additions & 0 deletions
@@ -0,0 +1,33 @@
# Extract form data as JSON from PDF using C#

The Syncfusion® Smart Form Recognizer is a .NET C# library that detects and extracts structured form data from PDFs and images in a console application.

## Steps to Extract Data from PDF in Console App

Step 1: Create a new .NET Core console application project.

Step 2: Install the [Syncfusion.SmartFormRecognizer.Net.Core](https://www.nuget.org/packages/Syncfusion.SmartFormRecognizer.Net.Core) NuGet package as a reference to your project from [NuGet.org](https://www.nuget.org/).

Step 3: Include the following namespaces in the Program.cs file.

```csharp
using System.IO;
using Syncfusion.SmartFormRecognizer;
```

Step 4: Add the following code snippet to the Program.cs file to extract data from a PDF.

```csharp
// Read the input PDF file as a stream.
using (FileStream inputStream = new FileStream(Path.GetFullPath(@"Input.pdf"), FileMode.Open, FileAccess.ReadWrite))
{
    // Initialize the Form Recognizer.
    FormRecognizer smartFormRecognizer = new FormRecognizer();
    // Recognize the form and get the output as a JSON string.
    string outputJson = smartFormRecognizer.RecognizeFormAsJson(inputStream);
    // Save the output JSON to a file.
    File.WriteAllText(Path.GetFullPath(@"Output.json"), outputJson);
}
```

For more information about SmartFormRecognizer, refer to the [documentation](https://help.syncfusion.com/document-processing/data-extraction/smart-form-recognizer/overview).
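
When inspecting the recognized form data by hand, an indented copy of the JSON is easier to read. The sketch below re-serializes the saved file with indentation using `System.Text.Json`. It assumes the result was written to `Output.json` in the working directory; `Indent` is a hypothetical helper name, and nothing here depends on the Smart Form Recognizer API or on a particular JSON schema.

```csharp
using System;
using System.IO;
using System.Text.Json;

// Assumption: Step 4 wrote its result to Output.json in the working directory.
string jsonPath = Path.GetFullPath("Output.json");

if (File.Exists(jsonPath))
{
    Console.WriteLine(Indent(File.ReadAllText(jsonPath)));
}
else
{
    Console.WriteLine($"{jsonPath} was not found. Run the recognition step first.");
}

// Hypothetical helper: parses the text and writes it back with indentation.
// Throws JsonException when the input is not well-formed JSON.
static string Indent(string json)
{
    using JsonDocument document = JsonDocument.Parse(json);
    var options = new JsonSerializerOptions { WriteIndented = true };
    return JsonSerializer.Serialize(document.RootElement, options);
}
```

The helper changes only whitespace; property names, values, and their order are written back as parsed.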
