This project can be found on GitHub.

The purpose of this project was to create a small script that finds all the links in a directory using grep and checks that each one is still valid by requesting it with curl and inspecting the HTTP status code of the response.
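As a rough illustration of that approach, the two steps could be sketched as below. This is not the script from the repo: the function names `find_links` and `status_of`, the URL pattern, and the 10-second timeout are all assumptions chosen for the example.

```shell
#!/bin/sh
# Hypothetical helper (not from the repo): print every unique http(s) URL
# found in the files under the directory given as $1.
#   -r  recurse into subdirectories
#   -E  use extended regular expressions
#   -o  print only the matching part of each line
#   -h  do not prefix matches with file names
# The pattern is an assumption: a URL ends at whitespace, a quote,
# an angle bracket, or a closing parenthesis.
find_links() {
    grep -rEoh 'https?://[^[:space:]"<>)]+' "$1" | sort -u
}

# Hypothetical helper (not from the repo): print the HTTP status code
# returned for the URL given as $1.
#   -s            silent, no progress output
#   -o /dev/null  discard the response body
#   -w            print only the status code
# The 10-second timeout is an assumption. curl prints 000 when it gets
# no HTTP response at all (for example on a DNS failure or a timeout).
status_of() {
    curl -s -o /dev/null --max-time 10 -w '%{http_code}' "$1"
}
```

With both functions defined, something like `find_links . | while read -r url; do echo "$(status_of "$url") $url"; done` would print one status code per link found under the current directory.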

The main script in this repo generates two files:

  • live.txt — containing all the live links
  • dead.txt — containing all the dead links

A link is considered dead if its request returns a status code of 400 or greater.
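The classification rule can be sketched as a small shell function. The name `record_link` is hypothetical and this is not the repo's actual code. It also makes one assumption beyond the rule above: a code of 000, which curl reports when no HTTP response arrives, is treated as dead.

```shell
#!/bin/sh
# Hypothetical helper (not from the repo): append a URL to live.txt or
# dead.txt in the current directory, based on its HTTP status code.
#   $1  the URL
#   $2  the status code, e.g. as printed by curl -w '%{http_code}'
record_link() {
    url=$1
    code=$2
    # 400 or greater is dead, as described above.
    # Assumption: 000 (no HTTP response at all) is also counted as dead.
    if [ "$code" -ge 400 ] || [ "$code" = "000" ]; then
        printf '%s\n' "$url" >> dead.txt
    else
        printf '%s\n' "$url" >> live.txt
    fi
}
```

Under this sketch, redirects such as 301 count as live because their code is below 400.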

I wrote this script while I was working as an offensive security engineer and doing penetration tests each week. I used it as one of a few preliminary checks when looking at a new codebase.

Possible Future Work

  • Creating a pre-commit hook that automatically checks for dead links before a commit
  • Creating a GitHub Actions workflow that runs the script as part of a CI pipeline
  • Using the git index (staging area) of a repo to check only files that have changed