Commit 4c3008f

Add detailed sitemap and robots.txt support
Introduced the SitemapPage model for richer sitemap metadata and added methods to retrieve detailed sitemap page information. Implemented dynamic robots.txt generation based on configuration. Updated README with comprehensive usage, configuration, and advanced examples. Cleaned up StringHelper and refactored WebsiteDiscoveryProvider for clarity and new features.
1 parent 8ccf7e3 commit 4c3008f

7 files changed, +434 -117 lines changed

Directory.Build.props

Lines changed: 1 addition & 1 deletion
@@ -9,7 +9,7 @@
     <PackageReleaseNotes>https://github.com/vhugogarcia/xperience-community-seo</PackageReleaseNotes>
     <PackageIcon>logo.png</PackageIcon>
     <PackageReadmeFile>README.md</PackageReadmeFile>
-    <PackageTags>kentico;xperience;seo;ai</PackageTags>
+    <PackageTags>kentico;xperience;seo;ai;llms;sitemap;robotstxt</PackageTags>
   </PropertyGroup>
 
   <ItemGroup>

README.md

Lines changed: 321 additions & 1 deletion
@@ -1,2 +1,322 @@
 # xperience-community-seo
-a centralized repository dedicated to essential SEO infrastructure files like robots.txt, sitemap.xml, llms.txt, and more. It aims to provide optimized configuration files that enhance search engine crawling, indexing, and visibility for websites and AI-driven search models.
+
+A centralized repository dedicated to essential SEO infrastructure files like robots.txt, sitemap.xml, llms.txt, and more. It aims to provide optimized configuration files that enhance search engine crawling, indexing, and visibility for websites and AI-driven search models.
+
+## Features
+
+- **Configurable Sitemap Endpoint**: Generate XML sitemaps with a customizable URL path
+- **Dynamic Content Discovery**: Automatically discover and include content items based on your configuration
+- **Cache Optimization**: Built-in caching using Kentico's cache dependency system
+- **Flexible Configuration**: Configure which content types, fields, and languages to include
+
+## Quick Start
+
+### Installation
+
+Install the NuGet package:
+
+```bash
+dotnet add package XperienceCommunity.SEO
+```
+
+### Configuration
+
+Register the SEO services in your `Program.cs`:
+
+```csharp
+using XperienceCommunity.SEO;
+
+var builder = WebApplication.CreateBuilder(args);
+
+// Register the SEO services with configuration
+builder.Services.AddXperienceCommunitySEO(options =>
+{
+    options.ReusableSchemaName = "PageMetadata"; // Your reusable schema name
+    options.DefaultLanguage = "en-US";
+    options.DescriptionFieldName = "MetaDescription";
+    options.TitleFieldName = "MetaTitle";
+    options.SitemapShowFieldName = "ShowInSitemap"; // Optional field
+    options.ContentTypeDependencies = new[]
+    {
+        "BlogPost",
+        "Article",
+        "LandingPage"
+    };
+});
+
+var app = builder.Build();
+
+// Your middleware configuration...
+app.MapControllers();
+
+app.Run();
+```
+
+## Usage Examples
+
+### 1. Basic Controller Example
+
+```csharp
+using Microsoft.AspNetCore.Mvc;
+using XperienceCommunity.SEO.Services;
+
+[ApiController]
+public class SEOController : ControllerBase
+{
+    private readonly IWebsiteDiscoveryProvider _websiteDiscoveryProvider;
+
+    public SEOController(IWebsiteDiscoveryProvider websiteDiscoveryProvider)
+    {
+        _websiteDiscoveryProvider = websiteDiscoveryProvider;
+    }
+
+    // Generates sitemap.xml at /sitemap.xml
+    [HttpGet("/sitemap.xml")]
+    [ResponseCache(Duration = 3600)] // Cache for 1 hour
+    public async Task<ActionResult> GetSitemap()
+    {
+        return await _websiteDiscoveryProvider.GenerateSitemap();
+    }
+
+    // Generates llms.txt at /llms.txt
+    [HttpGet("/llms.txt")]
+    [ResponseCache(Duration = 3600)] // Cache for 1 hour
+    public async Task<ActionResult> GetLlmsTxt()
+    {
+        return await _websiteDiscoveryProvider.GenerateLlmsTxt();
+    }
+
+    // Generates robots.txt at /robots.txt
+    [HttpGet("/robots.txt")]
+    [ResponseCache(Duration = 86400)] // Cache for 24 hours
+    public ActionResult GetRobotsTxt()
+    {
+        return _websiteDiscoveryProvider.GenerateRobotsTxt();
+    }
+}
+```
+
+### 2. Using Minimal APIs
+
+```csharp
+app.MapGet("/sitemap.xml", async (IWebsiteDiscoveryProvider provider, HttpContext context) =>
+{
+    var actionResult = await provider.GenerateSitemap();
+    await actionResult.ExecuteResultAsync(new ActionContext
+    {
+        HttpContext = context
+    });
+});
+
+app.MapGet("/llms.txt", async (IWebsiteDiscoveryProvider provider, HttpContext context) =>
+{
+    var actionResult = await provider.GenerateLlmsTxt();
+    await actionResult.ExecuteResultAsync(new ActionContext
+    {
+        HttpContext = context
+    });
+});
+
+app.MapGet("/robots.txt", async (IWebsiteDiscoveryProvider provider, HttpContext context) =>
+{
+    var actionResult = provider.GenerateRobotsTxt(); // an ActionResult, as in the controller example
+    await actionResult.ExecuteResultAsync(new ActionContext { HttpContext = context });
+});
+```
+
+### 3. Using Route Attributes
+
+```csharp
+[Route("seo")]
+public class SEOController : ControllerBase
+{
+    private readonly IWebsiteDiscoveryProvider _provider;
+
+    public SEOController(IWebsiteDiscoveryProvider provider)
+    {
+        _provider = provider;
+    }
+
+    [HttpGet("~/sitemap.xml")] // ~/ makes it root-relative
+    public async Task<ActionResult> Sitemap()
+        => await _provider.GenerateSitemap();
+
+    [HttpGet("~/llms.txt")] // ~/ makes it root-relative
+    public async Task<ActionResult> LlmsTxt()
+        => await _provider.GenerateLlmsTxt();
+
+    [HttpGet("~/robots.txt")] // ~/ makes it root-relative
+    public ActionResult RobotsTxt()
+        => _provider.GenerateRobotsTxt();
+}
+```
+
+## Configuration for robots.txt
+
+Add the following to your `appsettings.json`. This non-production sample blocks every crawler except Twitterbot and SiteAuditBot:
+
+```json
+{
+  "XperienceCommunitySEO": {
+    "RobotsContent": "User-agent: Twitterbot\nDisallow:\n\nUser-agent: SiteAuditBot\nAllow: /\n\nUser-agent: *\nDisallow: /"
+  }
+}
+```
+
+For production, you could use the following sample, which allows all crawlers:
+
+```json
+{
+  "XperienceCommunitySEO": {
+    "RobotsContent": "User-agent: *\nAllow: /"
+  }
+}
+```
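The `\n` sequences in `RobotsContent` are JSON escapes, so the JSON parser turns them into real line breaks before the value is served. The sketch below illustrates only that decoding step, using `System.Text.Json`. It is not the package's implementation, and `RobotsConfigSketch` is a hypothetical name introduced for this example.

```csharp
using System.Text.Json;

public static class RobotsConfigSketch
{
    // Same shape as the production appsettings.json sample. In a verbatim string
    // the \n stays as two characters (backslash + n), as it appears in the file.
    private const string Json =
        @"{ ""XperienceCommunitySEO"": { ""RobotsContent"": ""User-agent: *\nAllow: /"" } }";

    // Parses the JSON and returns the decoded RobotsContent value.
    public static string ReadRobotsContent()
    {
        using var document = JsonDocument.Parse(Json);
        return document.RootElement
            .GetProperty("XperienceCommunitySEO")
            .GetProperty("RobotsContent")
            .GetString() ?? string.Empty;
    }
}
```

Calling `RobotsConfigSketch.ReadRobotsContent()` yields a two-line string, which is why a single-line JSON value can describe a multi-line robots.txt.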
175+
176+
## Expected Output
177+
178+
### robots.txt (Non-production)
179+
```
180+
User-agent: Twitterbot
181+
Disallow:
182+
183+
User-agent: SiteAuditBot
184+
Allow: /
185+
186+
User-agent: *
187+
Disallow: /
188+
```
189+
190+
### robots.txt (Production)
191+
```
192+
User-agent: *
193+
Allow: /
194+
```
195+
196+
### sitemap.xml
197+
```xml
198+
<?xml version="1.0" encoding="utf-8"?>
199+
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
200+
<url>
201+
<loc>https://yoursite.com/about</loc>
202+
<lastmod>2025-10-03</lastmod>
203+
<changefreq>weekly</changefreq>
204+
</url>
205+
<url>
206+
<loc>https://yoursite.com/blog/article</loc>
207+
<lastmod>2025-10-03</lastmod>
208+
<changefreq>weekly</changefreq>
209+
</url>
210+
</urlset>
211+
```
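To make the structure of that XML concrete, here is a minimal sketch that builds an equivalent `<urlset>` document with `System.Xml.Linq`. The package itself relies on SimpleMvcSitemap, so this is only an illustration of the output format. `SitemapXmlSketch`, its tuple input, and the fixed `weekly` change frequency are assumptions made for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;
using System.Xml.Linq;

public static class SitemapXmlSketch
{
    // Namespace required by the sitemaps.org protocol.
    private static readonly XNamespace Ns = "http://www.sitemaps.org/schemas/sitemap/0.9";

    // Builds a <urlset> document with one <url> element per (Url, LastModified) pair.
    public static XDocument Build(IEnumerable<(string Url, DateTime LastModified)> pages)
    {
        var urlset = new XElement(Ns + "urlset",
            pages.Select(page => new XElement(Ns + "url",
                new XElement(Ns + "loc", page.Url),
                new XElement(Ns + "lastmod",
                    page.LastModified.ToString("yyyy-MM-dd", CultureInfo.InvariantCulture)),
                new XElement(Ns + "changefreq", "weekly"))));

        return new XDocument(new XDeclaration("1.0", "utf-8", null), urlset);
    }
}
```

Note that `XDocument.ToString()` omits the XML declaration, while `XDocument.Save` writes it.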
+
+### llms.txt
+```
+# YourWebsiteName
+
+## Pages
+
+- [About Us](https://yoursite.com/about): Learn about our company and mission
+- [Blog Article](https://yoursite.com/blog/article): Comprehensive guide to SEO
+- [Contact](https://yoursite.com/contact): Get in touch with our team
+```
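The llms.txt layout above is a Markdown heading, a `## Pages` section, and one link bullet per page. A minimal sketch of assembling that text follows. It does not show how `GenerateLlmsTxt()` works internally, and `LlmsTxtSketch` and `PageEntry` are hypothetical stand-ins rather than types from the package.

```csharp
using System.Collections.Generic;
using System.Text;

public static class LlmsTxtSketch
{
    // Hypothetical stand-in for page data; not the package's SitemapPage type.
    public sealed record PageEntry(string Title, string Url, string Description);

    // Renders the site name, a "Pages" section, and one Markdown link bullet per page.
    public static string Build(string siteName, IEnumerable<PageEntry> pages)
    {
        var builder = new StringBuilder();
        builder.Append("# ").Append(siteName).Append("\n\n");
        builder.Append("## Pages\n\n");

        foreach (var page in pages)
        {
            builder.Append($"- [{page.Title}]({page.Url}): {page.Description}\n");
        }

        return builder.ToString();
    }
}
```

The sketch uses explicit `\n` line endings so the output does not vary by operating system.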
+
+## Advanced Usage - Custom Sitemap Generation
+
+The `IWebsiteDiscoveryProvider` service also exposes public methods that allow you to retrieve sitemap data and create custom implementations:
+
+### Available Methods
+
+- **`GetSitemapPages()`** - Returns a list of `SitemapNode` objects for generating XML sitemaps
+- **`GetSitemapPagesWithDetails()`** - Returns a list of `SitemapPage` objects with additional metadata like titles and descriptions
+
+### Custom Sitemap Example
+
+```csharp
+[ApiController]
+public class CustomSEOController : ControllerBase
+{
+    private readonly IWebsiteDiscoveryProvider _provider;
+
+    public CustomSEOController(IWebsiteDiscoveryProvider provider)
+    {
+        _provider = provider;
+    }
+
+    [HttpGet("/custom-sitemap.xml")]
+    public async Task<ActionResult> GetCustomSitemap()
+    {
+        // Get the basic sitemap nodes
+        var sitemapNodes = await _provider.GetSitemapPages();
+
+        // Customize the nodes (e.g., add custom change frequency, priority, etc.)
+        foreach (var node in sitemapNodes)
+        {
+            if (node.Url.Contains("/blog/"))
+            {
+                node.ChangeFrequency = ChangeFrequency.Daily;
+                node.Priority = 0.8M; // SimpleMvcSitemap declares Priority as decimal?
+            }
+            else if (node.Url.Contains("/news/"))
+            {
+                node.ChangeFrequency = ChangeFrequency.Hourly;
+                node.Priority = 0.9M;
+            }
+        }
+
+        // Generate custom sitemap XML
+        return new SitemapProvider().CreateSitemap(new SitemapModel(sitemapNodes));
+    }
+
+    [HttpGet("/pages-with-metadata.json")]
+    public async Task<ActionResult> GetPagesWithMetadata()
+    {
+        // Get detailed page information including titles and descriptions
+        var pagesWithDetails = await _provider.GetSitemapPagesWithDetails();
+
+        // Transform or filter the data as needed
+        var customData = pagesWithDetails.Select(page => new
+        {
+            Url = page.SystemFields.WebPageUrlPath,
+            Title = page.Title,
+            Description = page.Description,
+            LastModified = DateTime.Now // Placeholder: the request time, not the page's modification date
+        });
+
+        return Ok(customData);
+    }
+}
+```
+
+### Data Models
+
+**SitemapNode** contains:
+- `Url` - The page URL path
+- `LastModificationDate` - When the page was last modified
+- `ChangeFrequency` - How often the page changes
+- `Priority` - Page priority (0.0 to 1.0)
+
+**SitemapPage** contains:
+- `SystemFields` - System information about the web page
+- `Title` - The page title from your configured title field
+- `Description` - The page description from your configured description field
+- `IsInSitemap` - Whether the page should be included in sitemaps
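As an illustration of how the `IsInSitemap` flag might be used when post-processing page data, the sketch below filters a list with LINQ. `PageInfo` is a hypothetical stand-in that mirrors three of the `SitemapPage` members listed above. Treating `IsInSitemap` as a `bool` is an assumption, since the README does not state its type.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class SitemapFilterSketch
{
    // Hypothetical stand-in mirroring three of the documented SitemapPage members.
    public sealed record PageInfo(string Title, string Description, bool IsInSitemap);

    // Keeps only pages flagged for the sitemap that also have a non-empty title.
    public static List<PageInfo> Visible(IEnumerable<PageInfo> pages) =>
        pages.Where(page => page.IsInSitemap && !string.IsNullOrWhiteSpace(page.Title))
             .ToList();
}
```

With the real service, the same `Where` clause could presumably be applied to the result of `GetSitemapPagesWithDetails()`.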
+
+## Testing
+
+You can test the endpoints using curl or your browser:
+
+```bash
+# Get robots.txt
+curl https://localhost:5001/robots.txt
+
+# Get sitemap
+curl https://localhost:5001/sitemap.xml
+
+# Get llms.txt
+curl https://localhost:5001/llms.txt
+```
+
+## License
+
+MIT License - see [LICENSE](LICENSE) for details.

src/GlobalUsing.cs

Lines changed: 2 additions & 3 deletions
@@ -2,11 +2,10 @@
 global using CMS.Helpers;
 global using CMS.Websites;
 global using CMS.Websites.Routing;
-
-global using Microsoft.AspNetCore.Builder;
 global using Microsoft.AspNetCore.Http;
 global using Microsoft.AspNetCore.Mvc;
-global using Microsoft.AspNetCore.Routing;
+global using System.Text.RegularExpressions;
+global using Microsoft.Extensions.Configuration;
 global using Microsoft.Extensions.DependencyInjection;
 
 global using SimpleMvcSitemap;
