Add support for linguist- marks in .gitattributes (addresses #386) #1292

Open
JJWRoeloffs wants to merge 2 commits into XAMPPRocky:master from JJWRoeloffs:add_linguist_support

@JJWRoeloffs JJWRoeloffs commented Nov 10, 2025

While working on a project of mine, I noticed that tokei does not support the linguist-vendored, linguist-documentation, and linguist-generated marks you can put in your .gitattributes. I then found that the issue for this (#386) is still open and marked help-wanted, even a few years later. With this pull request, I attempted to implement this linguist support in the way outlined in the original issue.

I haven't programmed in Rust in a while, so I am a bit... rusty (pun absolutely intended.) I hope I did everything properly! ^.^


In this pull request I am:

  • Adding the gix-attributes dependency for .gitattributes parsing, as specified in the original issue.
  • Marking everything tagged linguist-vendored, linguist-documentation, or linguist-generated in .gitattributes as to-be-ignored, using overrides.
    • Adding a test that makes sure this works.
  • Adding a new commandline option --no-ignore-linguist and its documentation, which disables the functionality I added.
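To make the list above concrete, here is a minimal sketch of what "collect the paths carrying a linguist mark" could look like. It is not the code in this PR: the PR uses gix-attributes for parsing, while this sketch hand-rolls a simplified parser with the standard library only. The names `IGNORE_MARKS` and `marked_patterns` are hypothetical, and treating `mark=true` as equivalent to a bare `mark` is an assumption.

```rust
/// The three linguist marks this PR treats as "ignore this path".
const IGNORE_MARKS: [&str; 3] = [
    "linguist-vendored",
    "linguist-generated",
    "linguist-documentation",
];

/// Returns the path patterns from `.gitattributes` contents that carry at
/// least one of the marks above, either bare (`mark`) or as `mark=true`
/// (the latter is an assumption of this sketch).
/// Simplification: lines are split on whitespace, so quoted patterns that
/// contain spaces are not handled.
fn marked_patterns(contents: &str) -> Vec<String> {
    let mut patterns = Vec::new();
    for line in contents.lines() {
        let line = line.trim();
        // Skip blank lines and comments.
        if line.is_empty() || line.starts_with('#') {
            continue;
        }
        let mut fields = line.split_whitespace();
        // The first field is the pattern, the rest are attributes.
        let Some(pattern) = fields.next() else { continue };
        let marked = fields.any(|attr| {
            IGNORE_MARKS
                .iter()
                .any(|mark| attr == *mark || attr.strip_suffix("=true") == Some(*mark))
        });
        if marked {
            patterns.push(pattern.to_string());
        }
    }
    patterns
}
```

The returned patterns would then be fed into whatever ignore mechanism is in use (overrides, in this PR). An unset form such as `-linguist-generated` is deliberately not collected.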

EDIT: Coming back a day later, I realize there are some complications that I missed. If you pass multiple directories to tokei, the expected behavior is that each directory has its own ignore files that apply only to the files in that directory, which is indeed how .gitignore is implemented in the ignore crate. However, with how I implemented .gitattributes, any rule found in a .gitattributes file applies globally to the entire run, even to separately passed directories. I am expecting there to be a decent fix for this, but I'll wait for your comment before putting more time into this PR.
(It also appears I flipped the ignore parent flag.)
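One possible shape for the scoped behavior described above, as a rough sketch only: keep the rules grouped by the directory they were found under, and only consult a group for paths inside that directory. `ScopedRules` and `is_ignored` are hypothetical names, and the rules are simplified to directory prefixes rather than real gitattributes globs.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical container: the ignore rules found under one directory that
/// was passed on the command line.
struct ScopedRules {
    /// The directory whose `.gitattributes` produced these rules.
    root: PathBuf,
    /// Simplification: directory prefixes relative to `root`, not globs.
    ignored_prefixes: Vec<PathBuf>,
}

impl ScopedRules {
    /// A rule set only applies to paths underneath its own root.
    fn ignores(&self, path: &Path) -> bool {
        match path.strip_prefix(&self.root) {
            Ok(relative) => self
                .ignored_prefixes
                .iter()
                .any(|prefix| relative.starts_with(prefix)),
            // The path is outside this root, so these rules say nothing.
            Err(_) => false,
        }
    }
}

/// A path is ignored if any rule set that covers it says so.
fn is_ignored(rules: &[ScopedRules], path: &Path) -> bool {
    rules.iter().any(|scoped| scoped.ignores(path))
}
```

With a rule set rooted at `a` that ignores `vendor`, `a/vendor/lib.rs` is ignored while `b/vendor/lib.rs` is not, which is the per-directory behavior the ignore crate gives .gitignore.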

And fix the problem in my code that I found that way: `linguist-ignore`
is a variable that should be true when we _do_ want to use the ignore
from the linguist marks in the .gitattributes file.
@XAMPPRocky
Owner

Thank you for your PR!

> I am expecting there to be a decent fix for this, but I'll wait for your comment before putting more time into this PR.
> (It also appears I flipped the ignore parent flag.)

I would expect it to behave the same as ignore.

@spenserblack
Contributor

Stumbled on this while browsing, my 2 cents:

> • Adding a new commandline option --no-ignore-linguist and its documentation, which disables the functionality I added.

I don't think you want to group these 3 attributes together. Especially linguist-documentation; I imagine that users of tokei would want to collect stats on their documentation, even if they're ignoring generated or vendored files. "These files are documentation" doesn't always mean "I don't care about the stats for these files." And they may find it confusing that you are matching Linguist's behavior when these attributes are explicitly set, but not when they're implicitly set (e.g. Linguist detects anything in a docs/ folder as documentation and ignores it).

This PR as it is is putting the cart before the horse IMO, because tokei doesn't have the precedent of ignoring documentation, generated, or vendored files AFAIK. This PR would make more sense if tokei had CLI flags like --[no-]ignore-documentation IMO. It just strikes me as unusual that tokei would respect Linguist's custom attributes when it otherwise behaves very differently from Linguist.

Also, there's the linguist-detectable attribute. If you're supporting these attributes, you should probably also support -linguist-detectable (modern usage) and linguist-detectable=false (old usage) for ignoring files.
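To illustrate the two spellings mentioned above, here is a small sketch of how a single attribute token from a .gitattributes line could be classified. The set / unset (`-name`) / unspecified (`!name`) / `name=value` forms follow the gitattributes format. The names `AttrState`, `parse_attr`, and `hides_file` are hypothetical, and this is not how gix-attributes exposes it.

```rust
/// The states a single attribute token can express on a `.gitattributes` line.
#[derive(Debug, PartialEq)]
enum AttrState {
    Set,           // `name`
    Unset,         // `-name`
    Unspecified,   // `!name`
    Value(String), // `name=value`
}

/// Splits one whitespace-delimited token into its attribute name and state.
fn parse_attr(token: &str) -> (String, AttrState) {
    if let Some(name) = token.strip_prefix('-') {
        (name.to_string(), AttrState::Unset)
    } else if let Some(name) = token.strip_prefix('!') {
        (name.to_string(), AttrState::Unspecified)
    } else if let Some((name, value)) = token.split_once('=') {
        (name.to_string(), AttrState::Value(value.to_string()))
    } else {
        (token.to_string(), AttrState::Set)
    }
}

/// True when the token marks a file as not detectable, in either the
/// modern (`-linguist-detectable`) or old (`linguist-detectable=false`) form.
fn hides_file(token: &str) -> bool {
    match parse_attr(token) {
        (name, AttrState::Unset) => name == "linguist-detectable",
        (name, AttrState::Value(value)) => {
            name == "linguist-detectable" && value == "false"
        }
        _ => false,
    }
}
```

Keeping the state separate from the name also makes it straightforward to honor explicit opt-outs such as `-linguist-vendored`, should the PR want to support them.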
