Edited: because I failed to proofread.
In this article we will take a quick look at how we can use Trusted Timestamps to better establish the provenance of student (coding) assignments.
The Problem
I often assess the attainment of learning outcomes using ‘coding projects’. The coding projects I design are styled as:
- open book: students are free to use any resources to help them complete the project;
- take home: students are free to do the project in places that suit them best (the lab or at home).
Unfortunately, projects styled this way leave open the provenance of each submission’s development history. When assessing attainment we, as instructors, must be sure that the students themselves did indeed create the coursework.
Factors that we must consider are:
- Did the student use Generative AI?
- Did the student plagiarise their work from other sources?
Although we use plagiarism detection tools such as Turnitin (or Urkund or JPlag), they can only do so much and can report false positives.
A related problem is when students claim they submitted the ‘wrong submission’. When the student then presents the ‘original’ submission, we need to verify that it existed in that form before the submission deadline.
DVCS as a partial solution
During my teaching career I have seen how beneficial distributed version control systems have been in attesting provenance.
Importantly, we can use DVCSs as a means to capture the development history of a project. Unfortunately, such history is weak as DVCSs enable one to rewrite history as one pleases. Thus, on their own, DVCSs do not offer strong submission provenance.
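To make the weakness concrete, here is a minimal shell sketch, assuming git is installed. The repository, identity, and file names are all made up for illustration. It shows that the dates recorded in a commit are whatever the committer says they are.

```shell
#!/bin/sh
# Sketch: commit dates in git are self-reported, so they can be forged.
set -eu

repo=$(mktemp -d)                            # throwaway repository
cd "$repo"
git init --quiet
git config user.name "Demo Student"          # hypothetical identity
git config user.email "student@example.org"

echo 'print("hello")' > solution.py
git add solution.py

# Both dates are read from the environment; git does not check them
# against any trusted clock.
forged="2020-01-01 09:00:00 +0000"
GIT_AUTHOR_DATE="$forged" GIT_COMMITTER_DATE="$forged" \
    git commit --quiet -m "Finished well before the deadline"

# Print the author date and committer date as stored in the history.
logged=$(git log -1 --date=short --format='%ad %cd')
echo "$logged"
```

Nothing in the resulting history distinguishes this commit from one genuinely made on that date, which is why an external witness is needed.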
Coding forges as a partial solution
Since joining Strathclyde I have seen colleagues use coding forges (GitHub and a local GitLab instance) to host student submissions. Students then submit a URL via the coursework submission tool.
An interesting aspect of using forges is that the development history is hosted by a third-party. As long as the student pushes to the remote forge the instructors can view the history. A force push of new history, unfortunately, removes that certainty.
Trusted Infrastructure as a partial solution
Recently, I have learned how colleagues have used evidence from OneDrive storage as a means to attest provenance. That is, we can use the file modification times on OneDrive as a means to assert provenance. Such provenance is, however, weak: if the student touches the files then the timestamps are no longer valid.
When I was at St Andrews, however, I encountered a way to provide stronger provenance using a specially deployed forge. At St Andrews, students had access to a Mercurial service built in to the user accounts. Specifically, each valid user account (staff and student) had a Mercurial remote (.hg, IIRC) exposed as:
username.hg.cs.sta.ac.uk
Students could use their accounts as a means to host their coursework. Importantly, user accounts were hosted/managed/controlled using a ZFS solution, from which snapshots were created for backup purposes.
An important consequence of this setup was that if a student submitted the ‘wrong’ submission, and had used the School-provided Mercurial service, we could check out a version of their coding project from before the submission deadline and use the restored submission as the basis for marking. Given that we operated the infrastructure, the provenance/trust of the submission could be attested strongly.
This setup was not ideal, as using GitLab/GitHub offers a better development experience.
Trusted Attestation
Recently, I have been reading about the attestation of software supply chains. Specifically:
The core idea is that a trusted third party provides attestation of software artefact generation. One way we can provide attestation is by using trusted timestamps to timestamp important stages in the software development lifecycle. A trusted timestamp is timestamped data that is digitally signed by a trusted third party. Typically, the signed data is a hash.
In fact, trusted timestamps are useful for Digital Forensics!
https://www.metaspike.com/trusted-timestamping-rfc-3161-digital-forensics/
The process of trusted timestamping, including file formats and protocols, has been specified in RFC 3161. There are many Timestamp Authorities one can choose from:
- https://freetsa.org/index_en.php
- https://knowledge.digicert.com/general-information/rfc3161-compliant-time-stamp-authority-server
Better still, we can use OpenSSL and curl as client software to interact with Timestamp Authorities and with the timestamps themselves.
The trusted timestamp protocol is a simple request-response interaction. We request the trusted timestamp authority to sign some data (hash), and the trusted timestamp authority responds with a signed version of the data.
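As a rough illustration of the request half of that exchange, the following shell sketch builds a timestamp query for a file and decodes it, assuming openssl is installed. The file name and contents are hypothetical, and no network call is made; sending the query to an authority is a separate step.

```shell
#!/bin/sh
# Sketch: build the request half of an RFC 3161 exchange and inspect it.
set -eu

work=$(mktemp -d)
# Hypothetical artefact to be timestamped.
printf 'int main(void) { return 0; }\n' > "$work/submission.c"

# Hash the file with SHA-256 and wrap the hash in a timestamp query.
# -cert asks the authority to include its signing certificate in the reply.
openssl ts -query -data "$work/submission.c" -sha256 -cert \
    -out "$work/submission.tsq"

# Decode the binary query. It carries only the hash, not the file contents.
query_text=$(openssl ts -query -in "$work/submission.tsq" -text)
echo "$query_text"
```

Because only the hash leaves the machine, the authority never sees the student's code; it simply signs the hash together with its own notion of the current time.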
Integration of Trusted Timestamps into the Development Process
We know that DVCSs offer a record of development history but that record cannot be fully trusted.
Integrating trusted timestamps at each commit enables non-repudiation of that history.
GitTrustedTimestamps is a git post-commit hook that associates each commit with a trusted timestamp.
https://medium.com/swlh/git-as-cryptographically-tamperproof-file-archive-using-chained-rfc3161-timestamps-ad15836b883
https://github.com/MrMabulous/GitTrustedTimestamps/
Unfortunately, GitTrustedTimestamps is itself a heavyweight solution, as each rewritten version of the history (after a rebase, say) needs to be timestamped again.
Solutions such as SLSA are also heavyweight when looking to integrate trusted timestamps into the development process. We need to provide quite complex infrastructure that includes CI servers and actions. (Having the infrastructure to do so within a University would be quite nice though!) A key take-away from SLSA, however, is that trusted timestamps are created at important milestones such as version tagging and artifact generation.
Our approach to asserting the provenance of student submissions will be similar: enable insertion of trusted timestamps at important milestones. Specifically, the most important milestone of all: submission.
Intermezzo: Only working on Clean Trees
To help our students we need to ensure that all operations are performed on a clean working tree.
I was fortunate to have found a simple solution that integrates into a Makefile, enabling us to block targets when the tree is dirty.
.PHONY: status
status:
	@status=$$(git status --porcelain); \
	if [ -n "$${status}" ]; \
	then \
		echo "Error - working directory is dirty. Commit those changes!"; \
		exit 1; \
	fi
A target to timestamp projects.
Our solution will borrow from GitTrustedTimestamps, and insert commits recording trusted timestamps of the project’s current status. Like GitTrustedTimestamps we will use standard utilities (openssl, curl, and friends) for timestamp generation.
We begin by generating a ‘unique’ name to label the binary request/response.
STAMP := $(shell date +%Y-%m-%d-%H-%M-%S)
We will use the Free Timestamp Authority.
TrustedTimestampAuthority := https://freetsa.org/tsr
We will then start writing our timestamp target that only works on clean trees:
timestamp: status
We will store all our timestamps in a ‘hidden’ folder called .timestamp.
	mkdir -p .timestamp
With the infrastructure written we now generate our timestamp request based on the hash of the current head.
	@commit=$$(git rev-parse --verify HEAD); \
	openssl ts -query \
		-digest="$${commit}" -sha1 -cert \
		-out .timestamp/$(STAMP)-$(REGNO).request.tsq
We require -sha1 as the version of git on my machine still uses SHA-1 for commit hashes; with any other digest algorithm, openssl complains about size mismatches.
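If the repository might use a different object format, one option is to pick the digest flag from the length of the commit hash. The sketch below is an assumption-laden illustration rather than part of the Makefile: it creates a throwaway repository, and the identity and file names are hypothetical.

```shell
#!/bin/sh
# Sketch: choose the openssl digest flag from the commit hash length.
set -eu

repo=$(mktemp -d)                            # throwaway repository
cd "$repo"
git init --quiet
git config user.name "Demo Student"          # hypothetical identity
git config user.email "student@example.org"
echo "notes" > README
git add README
git commit --quiet -m "Initial commit"

commit=$(git rev-parse --verify HEAD)

# A hex digest has two characters per byte: 40 for SHA-1, 64 for SHA-256.
case ${#commit} in
    40) alg=sha1 ;;
    64) alg=sha256 ;;
    *)  echo "Unexpected commit hash length: ${#commit}" >&2; exit 1 ;;
esac

# The digest is supplied directly, so its length must match the algorithm.
openssl ts -query -digest "$commit" "-$alg" -cert -out head.tsq
query_text=$(openssl ts -query -in head.tsq -text)
echo "$query_text"
```

The same case statement could be folded into the Makefile recipe, at the cost of a slightly longer shell line.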
The request file (.tsq, with a unique name that also includes the student’s registration number) can now be sent to the Timestamp Authority. We do so using curl:
	curl -H "Content-Type: application/timestamp-query" \
		--data-binary "@.timestamp/${STAMP}-${REGNO}.request.tsq" \
		${TrustedTimestampAuthority} \
		--output .timestamp/${STAMP}-${REGNO}.response.tsr
Like the request, we store the response (.tsr; the timestamp itself) as a binary file in our .timestamp directory and give the response a unique name.
Next we need to construct a meaningful commit message, and store the request and response in the commit itself.
We do so by generating a commit message from the text version of the timestamp, together with a brief, meaningful summary line.
	printf '[ admin ][ tts ] establishing dev history provenance.\n\n' > $(STAMP).txt
	openssl ts -reply -in .timestamp/$(STAMP)-$(REGNO).response.tsr -text >> $(STAMP).txt
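We can then stage and commit the request/response, and remove the temporary commit message file. The new files are untracked, so they must be added before git commit will accept the path.
	git add .timestamp/
	git commit .timestamp/ --quiet -F "$(STAMP).txt"
	rm $(STAMP).txt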
Verifying Timestamps
To verify timestamps we need to obtain the verification keys for our Trusted Timestamp Authority. The keys for FreeTSA are online.
We can point our Makefile to them as follows:
TrustedTimestampAuthorityCertCA := ~/freetsa/cacert.pem
TrustedTimestampAuthorityCert := ~/freetsa/tsa.crt
I have dumped them in my home directory; where they live does not matter per se. It would, perhaps, be a good idea to bundle the certs with the project.
We can then use openssl to verify any request/response pair in our .timestamp directory.
.PHONY: ${shell ls .timestamp/*.tsr}
.timestamp/*.tsr:
	openssl ts -verify \
		-in $@ \
		-queryfile ${subst response,request,${subst tsr,tsq,$@}} \
		-CAfile ${TrustedTimestampAuthorityCertCA} \
		-untrusted ${TrustedTimestampAuthorityCert}
What do students submit?
The timestamp target inserts trusted timestamps in the development history. A question now is: what are we asking students to submit?
Well, we can ask students to submit links to their git repositories on their favourite forges. The trusted timestamps provide strong assurances that the timestamped history existed no later than the time recorded by the authority. If we ask students to sign their commits, we can get stronger authenticity claims, especially if we provide students with signing keys!
make timestamp
git push origin main
Of course this approach will fail if students do not insert the timestamps themselves! Moreover, forges may be hosted by third-parties and third-parties can go offline.
A more self-contained approach would be to ask students to create a git-bundle that preserves the development history (including timestamps), and then submit this bundle to your favourite online submission site (Moodle/MMS/Blackboard/Canvas). We can even create a target to generate a bundle in which the latest commit is the trusted timestamp:
bundle: status timestamp
	git bundle create $(REGNO).bundle --all
	git bundle verify $(REGNO).bundle
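On the marking side, a bundle can be treated like any other remote. The following shell sketch, assuming git is installed, builds a small throwaway project, bundles it, and clones the bundle to recover the full history. The project contents, identity, and directory names are hypothetical; the registration number is the example value from the Makefile stub.

```shell
#!/bin/sh
# Sketch: create a bundle, verify it, and recover the history from it.
set -eu

work=$(mktemp -d)
mkdir "$work/project"
cd "$work/project"
git init --quiet
git config user.name "Demo Student"          # hypothetical identity
git config user.email "student@example.org"
echo "first draft" > report.txt
git add report.txt
git commit --quiet -m "First draft"
echo "final version" > report.txt
git commit --quiet -am "Final version"

regno=202412345                              # example registration number
# --all packs every ref (including HEAD) into a single file.
git bundle create "$work/$regno.bundle" --all
git bundle verify "$work/$regno.bundle"

# Marker's side: clone straight from the submitted bundle file.
git clone --quiet "$work/$regno.bundle" "$work/marking"
commits=$(git -C "$work/marking" rev-list --count HEAD)
echo "commits recovered: $commits"
```

In a real submission the cloned history would also contain the .timestamp directory, so the verification target can be run directly inside the clone.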
Appendix
Example Textual Trusted Timestamp.
The raw markdown source for the original post was timestamped! Changes were made as I uploaded an older version of the Makefile target.
Status info:
Status: Granted.
Status description: unspecified
Failure info: unspecified
TST info:
Version: 1
Policy OID: tsa_policy1
Hash Algorithm: sha256
Message data:
0000 - 27 64 54 0b d3 dd 6d 30-97 a8 32 76 d4 58 92 be 'dT...m0..2v.X..
0010 - 8c 16 8e d4 71 d0 c5 93-76 76 f9 5c 57 ce b9 f9 ....q...vv.\W...
Serial number: 0x065FAB92
Time stamp: Apr 24 20:38:59 2025 GMT
Accuracy: unspecified
Ordering: yes
Nonce: 0x7397E2316F9C0815
TSA: DirName:/O=Free TSA/OU=TSA/description=This certificate digitally signs documents and time stamp requests made using the freetsa.org online services/CN=www.freetsa.org/emailAddress=busilezas@gmail.com/L=Wuerzburg/C=DE/ST=Bayern
Extensions:
Example Makefile stub
REGNO := 202412345
TrustedTimestampAuthority := https://freetsa.org/tsr
## Intentionally left blank, can be provided along side the code or used later.
# These need to be instantiated when verifying timestamps
TrustedTimestampAuthorityCertCA :=
TrustedTimestampAuthorityCert :=
STAMP := $(shell date +%Y-%m-%d-%H-%M-%S)
.PHONY: ${shell ls .timestamp/*.tsr} timestamp bundle
## Insert a trusted timestamp at the HEAD of the project
timestamp:
	mkdir -p .timestamp
	@commit=$$(git rev-parse --verify HEAD); \
	openssl ts -query \
		-digest="$${commit}" -sha1 -cert \
		-out .timestamp/$(STAMP)-$(REGNO).request.tsq
	curl -H "Content-Type: application/timestamp-query" \
		--data-binary "@.timestamp/${STAMP}-${REGNO}.request.tsq" \
		${TrustedTimestampAuthority} \
		--output .timestamp/${STAMP}-${REGNO}.response.tsr
	printf '[ admin ][ tts ] establishing dev history provenance.\n\n' > ${STAMP}.txt
	openssl ts -reply -in .timestamp/$(STAMP)-$(REGNO).response.tsr -text >> ${STAMP}.txt
	git add .timestamp/
	git commit .timestamp/ --quiet -F "$(STAMP).txt"
	rm ${STAMP}.txt
## Verify a timestamp.
.timestamp/*.tsr:
	openssl ts -verify \
		-in $@ \
		-queryfile ${subst response,request,${subst tsr,tsq,$@}} \
		-CAfile ${TrustedTimestampAuthorityCertCA} \
		-untrusted ${TrustedTimestampAuthorityCert}
## Create a git-bundle with a Trusted HEAD
## Assumes the status target from the clean-tree section is also in this Makefile.
bundle: status timestamp
	git bundle create $(REGNO).bundle --all
	git bundle verify $(REGNO).bundle