LordTobyJ/Shorthand-challenge

Project Specification

This project is a write-up of the Shorthand-Challenge take-home test. The solution is written in TypeScript, using the Express API framework with a companion CLI tool (as per the requirements).

Although it will not be expanded upon, the project has been written as if it were being added to the Shorthand production system, with the exception of a few "Dev Notes" throughout. These notes appear sparingly, where a decision went beyond the initial scope of the project or where further deliberation might be desired.

I chose TypeScript for this project because, in my opinion, it brings out the best of JavaScript's versatility whilst offering some of the more structured elements of C#.

Wherever possible, I have attempted to make it easy to traverse the project both mentally and via an IDE. This includes overt function names and attention to Locality of Behaviour. A number of Known Constraints exist for the project, covering items that were either out of scope or would have added unnecessary complexity. As a general rule, I added an item to the Known Constraints if it was likely to be a decision that would need a conversation on the job.

Installation / Usage

The project requires Node 18 to compile and run, or Docker with docker compose to run and test.

To see the project running, run npm install followed by npm run dev, or run docker-compose up -d --build. By default the server runs on localhost:5000 (the port number can be changed in the .env file).

To run the CLI tool, run npm run build:cli, then use the command cli get-structure <url> to generate and return the semantic structure for a page. The -o switch takes a file path and saves the output to that file (e.g. cli get-structure https://en.wikipedia.org/wiki/JavaScript -o test.json).

To run tests, use the command npm run test, or to see coverage run npm run test:coverage. You can also run the tests in band (used primarily for debugging with a debugger utility) with the command npm run test:debug.

Architecture

The project uses a docker compose containerisation approach to minimise difficulty when deploying to a cloud infrastructure platform. The key assumption here is that the project could be deployed into a Kubernetes (K8s) cluster or an auto-scaling ECS cluster without needing extra changes.

Project Structure

The project is heavily isolated into localities of behaviour, splitting into multiple files by function and class, and uses folders liberally as a grouping mechanism for the files. The only top level source files are the index (which hosts the endpoint logic for the project) and the Global variables file. The thinking here is to help developers easily navigate the project and give them mental permission to extend easily. Admittedly, the types could be better organised, and the external structures have been grouped largely to remove the noise of similarly named type patterns. Given that the external types would be schema locked, you would expect to see these changed less often than other types.

├── src                                           (Holds the source files)
│   ├── functions                                 (Holds the key logic)
│   │   ├── GenerateSemanticStructure.ts
│   │   └── GenerateStructureFromURL.ts
│   ├── helpers                                   (Holds helper functions)
│   │   └── SemanticStructureHelpers.ts
│   ├── types                                     (Holds the internal types and interfaces)
│   │   ├── external                              (Holds the Schema-locked External types)
│   │   │   ├── SemanticStructureExternal.ts
│   │   │   ├── SemanticStructureNodeExternal.ts
│   │   │   └── SkippedLevelNodeExternal.ts
│   │   ├── Fixture.ts
│   │   ├── SemanticStructure.ts
│   │   ├── SemanticStructureNode.ts
│   │   ├── SemanticStructureNodeChildless.ts
│   │   ├── SkippedLevelsNode.ts
│   │   ├── Stack.ts
│   │   └── SturctureOutput.ts
│   ├── GLOBALS.ts                                (Holds global variables)
│   └── index.ts                                  (Holds the endpoint logic for the express API)
├── tests                                         (Holds the tests)
│   ├── fixtures                                  (Holds the test fixtures)
│   │   ├── GenerateSemanticStructure.ts
│   │   ├── index.ts
│   │   └── SemanticStructureHelpers.ts
│   ├── functions                                 (Holds the test files for the function source files)
│   │   └── GenerateSemanticStructure.test.ts
│   ├── helpers                                   (Holds the test files for the helper source files)
│   │   └── SemanticStructureHelpers.test.ts
│   └── index.test.ts
├── .env
├── cli.ts
├── docker-compose.yml
├── Dockerfile
├── jest.config.ts
├── package.json
├── README.md
└── tsconfig.json

Testing

The project uses Jest to run unit tests and has 56 tests over 3 suites, giving 98.85% coverage, with a single line not covered (the app.listen command, which is purposefully turned off for tests to make sure there are no open handles).
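One common way to keep app.listen from running under test is to guard it on the environment. The sketch below is an assumption about how such a guard might look, not the project's actual code: the function name shouldStartListening is hypothetical, and it relies on Jest setting NODE_ENV to "test" when that variable is not already set.

```typescript
// Hypothetical guard (not taken from the project source): decides whether
// the HTTP server should bind a port in the current environment.
// Assumption: tests run under Jest, which sets NODE_ENV to "test" when the
// variable is not already defined.
function shouldStartListening(nodeEnv: string | undefined): boolean {
  return nodeEnv !== "test";
}

// Intended use in an entry file (app and port are assumed names):
//   if (shouldStartListening(process.env.NODE_ENV)) {
//     app.listen(port);
//   }
```

Keeping the decision in a small pure function means the guard itself can be unit tested, leaving only the listen call uncovered.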

To run tests, see the Installation / Usage section above.

Opportunities for Improvement

Notably missing from the project are a frontend and a CloudFormation YAML template for quick deployment. These were out of scope for the project, and I chose to make an earlier v1 release rather than allow scope creep for exciting features.

Known Constraints

Malformed HTML

This solution does nothing to handle malformed HTML, and considers it out of scope. Cheerio is considered a resilient solution for handling HTML, and I am happy to rely on it for well-organised HTML pages. At some point you have to draw a box around what is controllable by your application, and malformed HTML falls squarely outside of that.

TLD Constraint

Because some valid addresses have no TLD (such as localhost and IPv4 addresses), the validator.js package that I am using to offload the URL validation process does not reject URLs with a missing TLD. Given the scope of the project, I opted to forgo including a process to cover this bug.
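To illustrate the distinction involved, here is a minimal sketch of a separate TLD check. It is not part of the project and does not use validator.js: it swaps in the built-in WHATWG URL parser, and the helper name hasTopLevelDomain is hypothetical. It is a heuristic only, and makes no attempt to confirm that the final label is a registered TLD.

```typescript
// Hypothetical helper (not part of the project source). Heuristic check for
// whether a URL's hostname ends in something that looks like a TLD.
function hasTopLevelDomain(rawUrl: string): boolean {
  let hostname: string;
  try {
    hostname = new URL(rawUrl).hostname;
  } catch {
    return false; // not parseable as an absolute URL
  }

  // IPv6 literals are returned in square brackets, e.g. "[::1]".
  if (hostname.startsWith("[")) return false;

  // IPv4 addresses are dotted quads with no TLD.
  if (/^\d{1,3}(\.\d{1,3}){3}$/.test(hostname)) return false;

  // A single label such as "localhost" has no TLD.
  const labels = hostname.split(".");
  if (labels.length < 2) return false;

  // Treat the last label as a TLD if it has two or more characters and
  // contains at least one letter.
  const last = labels[labels.length - 1] ?? "";
  return last.length >= 2 && /[a-z]/i.test(last);
}
```

A caller could use a check like this to warn about hosts such as localhost while still accepting them during local development.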

Case Insensitivity

Because of time constraints, and because it was not clarified in the original instructions, I have marked case sensitivity of query parameters and page names as out of scope for this project. For example, a URL could include uppercase variations in its query parameters or in its path (e.g. "https://foo.com/?a=foo&A=bar").
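The sketch below shows why this matters, using only the built-in WHATWG URL parser. The function name describeCaseHandling is hypothetical and not part of the project. The parser lowercases the hostname but leaves the path and query parameter names as written, so "a" and "A" are treated as different keys.

```typescript
// Hypothetical helper (not part of the project source): reports which parts
// of a URL keep their original letter case after parsing.
function describeCaseHandling(rawUrl: string): {
  hostname: string;
  pathname: string;
  paramKeys: string[];
} {
  const url = new URL(rawUrl);
  return {
    hostname: url.hostname, // lowercased by the parser
    pathname: url.pathname, // case preserved
    paramKeys: Array.from(url.searchParams.keys()), // case preserved
  };
}

const parts = describeCaseHandling("https://FOO.com/Page?a=foo&A=bar");
// parts.hostname is "foo.com", parts.pathname is "/Page",
// and parts.paramKeys is ["a", "A"].
```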

XHTML + XML support

The current implementation only handles pages served with a text/html content type. Pages served as application/xhtml+xml or other XML-based content types are not processed and will return an error. Handling these formats would require an XML parser and additional logic to interpret heading elements, which is considered out of scope for this project.
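A content type check of this kind can be sketched as follows. This is an illustration under assumptions rather than the project's implementation: the function name isHtmlContentType is hypothetical, and it takes the raw Content-Type header value (or null when the header is absent).

```typescript
// Hypothetical helper (not part of the project source): accepts only
// responses whose media type is text/html, ignoring parameters such as
// "charset=utf-8" and ignoring letter case.
function isHtmlContentType(headerValue: string | null): boolean {
  if (headerValue === null) return false;

  // The media type is everything before the first ";".
  const [mediaType = ""] = headerValue.split(";");
  return mediaType.trim().toLowerCase() === "text/html";
}
```

With this check, "text/html; charset=utf-8" is accepted, while "application/xhtml+xml" and a missing header are rejected.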

General Handling of Known Constraints

In a production environment, bugs like these would be flagged as known issues and triaged to determine whether each one is important enough to be solved immediately, backlogged, or accepted.