Your code runs on our servers.
Here's exactly what happens.
No magic and no hand-waving: every submission takes the same journey, with multiple independent safety layers along the way. That's also why the metrics are trustworthy: they're measured on the server, next to the code, where the browser can't fake them.
The pipeline
The journey of a submission
From your keystroke to a graded report in five stops, each one assuming the previous could be wrong.
Write
C# in a full editor, right in your browser. Run the samples as often as you like.
Screen
Every submission is checked before anything runs. Code that fails the check never executes.
Compile
Built server-side with the same compiler you use locally, with full diagnostics on errors.
Run, sealed
Executed in a disposable, fully isolated sandbox. One per submission.
Measure & grade
Time, memory, and correctness, all measured on the server, where they can't be faked.
Seconds, end to end: the snippet goes out, the graded report comes back with per-test timings, budgets, memory, and diffs.
Containment
Inside the sandbox
We treat every submission as hostile, including yours. That's a feature: the same walls that contain malicious code make the grading fair and the timings clean.
$ submit Solution.cs
▸ sandbox spawned fresh · single-use
# security hardening, applied to every run
✓ network isolated
✓ file system read-only
✓ cpu · memory · processes capped
▸ tests complete 148 ms
▸ sandbox destroyed nothing persists
⋮
This exact lifecycle runs for every submission: spawn, seal, run, measure, destroy. There is no long-lived server that your code shares.
The grading
How each track grades you
Track 01
⚡ Algorithms
Correct gets you halfway. The hidden suite scales the input until complexity decides the outcome.
- Visible sample cases, plus a hidden suite of inputs large enough to separate efficient solutions from merely correct ones.
- Per-test time budgets: correct-but-slow times out where the efficient solution passes with room to spare.
- Allocation tracking on every run, and on some puzzles a hard allocation budget, where the copy-and-reverse approach fails and the in-place one passes.
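To make the copy-and-reverse versus in-place contrast concrete, here is a minimal sketch. The class and method names are hypothetical, and how the grader actually counts allocations is not specified here; the point is only that the first method allocates a second array while the second does not.

```csharp
using System;

public static class ReverseDemo
{
    // Copy-and-reverse: allocates a second array of the same length.
    public static int[] ReverseCopy(int[] source)
    {
        var result = new int[source.Length];
        for (int i = 0; i < source.Length; i++)
        {
            result[i] = source[source.Length - 1 - i];
        }
        return result;
    }

    // In-place: swaps from both ends toward the middle, no extra array.
    public static void ReverseInPlace(int[] items)
    {
        for (int left = 0, right = items.Length - 1; left < right; left++, right--)
        {
            (items[left], items[right]) = (items[right], items[left]);
        }
    }

    public static void Main()
    {
        var data = new[] { 1, 2, 3, 4, 5 };
        ReverseInPlace(data);
        Console.WriteLine(string.Join(",", data)); // prints 5,4,3,2,1
    }
}
```

Both methods produce the same ordering; under an allocation budget, only the extra array in `ReverseCopy` would count against you.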
The N+1 starter · ×41 round-trips · ✗ Timeout, over budget
The set-based rewrite · 1 round-trip · ✓ Passed, 12 ms
Track 02
Database / EF
Your LINQ runs against a real database, and the grader shows you what it really cost.
- Enough data that inefficiency shows up in the timings, not just in code review.
- Expand any test to see the queries your code actually produced, and what they cost.
- Plan-graded puzzles capture the execution plan the engine chose and grade it: full-table reads flagged in red, index usage in green. Returning the right rows the wrong way still fails.
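The N+1 pattern is easiest to see by counting round-trips. The sketch below uses a hypothetical in-memory `FakeDb` that increments a counter on every query method; it is not EF and not the platform's grader, only an illustration of why 40 parent rows turn into 41 queries.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a database: each Query* call counts as one round-trip.
public sealed class FakeDb
{
    private readonly List<(int OrderId, string Item)> _lines = new();
    public int RoundTrips { get; private set; }

    public FakeDb(int orderCount)
    {
        for (int id = 1; id <= orderCount; id++)
        {
            _lines.Add((id, "item-a"));
            _lines.Add((id, "item-b"));
        }
    }

    public List<int> QueryOrderIds()
    {
        RoundTrips++;
        return _lines.Select(l => l.OrderId).Distinct().ToList();
    }

    public List<string> QueryItemsForOrder(int orderId)
    {
        RoundTrips++;
        return _lines.Where(l => l.OrderId == orderId).Select(l => l.Item).ToList();
    }

    public Dictionary<int, List<string>> QueryItemsGroupedByOrder()
    {
        RoundTrips++;
        return _lines.GroupBy(l => l.OrderId)
                     .ToDictionary(g => g.Key, g => g.Select(l => l.Item).ToList());
    }
}

public static class NPlusOneDemo
{
    // N+1: one query for the orders, then one more query per order.
    public static int CountItemsNPlusOne(FakeDb db)
    {
        int total = 0;
        foreach (int id in db.QueryOrderIds())
        {
            total += db.QueryItemsForOrder(id).Count;
        }
        return total;
    }

    // Set-based: a single query returns every order's items at once.
    public static int CountItemsSetBased(FakeDb db)
    {
        return db.QueryItemsGroupedByOrder().Values.Sum(items => items.Count);
    }

    public static void Main()
    {
        var slow = new FakeDb(40);
        int a = CountItemsNPlusOne(slow);
        var fast = new FakeDb(40);
        int b = CountItemsSetBased(fast);
        Console.WriteLine($"N+1: {a} items, {slow.RoundTrips} round-trips");     // N+1: 80 items, 41 round-trips
        Console.WriteLine($"set-based: {b} items, {fast.RoundTrips} round-trip"); // set-based: 80 items, 1 round-trip
    }
}
```

In real EF Core code the set-based shape usually comes from a projection or an eager-loading `Include` rather than a hand-written grouping method; the counter here just makes the cost difference visible.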
method length: limit 25 lines
cyclomatic complexity: limit 8
nesting depth: limit 3 levels
✓ 9 / 9 tests still passing: behavior never changed
Track 03
🧹 Refactoring
Working code you'd hate to inherit. Make it clean without changing what it does.
- The behavioral tests pass before you touch anything, and they must still pass when you're done.
- Structural gates measured from your source: method length, cyclomatic complexity, nesting depth, duplicate blocks.
- Every flavor of real-world mess (tangled conditionals, god methods, copy-paste blocks, arrow code), including the famous Gilded Rose kata.
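As an illustration of the kind of change these gates reward, here is a small sketch of arrow code flattened with guard clauses. The shipping-fee rules and all names are invented for the example, and how the grader counts nesting levels is an assumption; what matters is that both methods behave identically while the second has far less nesting.

```csharp
using System;

public static class ShippingDemo
{
    // Arrow code: three nested ifs, with the error handling far from its condition.
    public static decimal FeeNested(decimal total, bool isMember, string country)
    {
        if (country != null)
        {
            if (total > 0)
            {
                if (isMember)
                {
                    return 0m;
                }
                else
                {
                    return total >= 50m ? 0m : 4.99m;
                }
            }
            else
            {
                throw new ArgumentOutOfRangeException(nameof(total));
            }
        }
        else
        {
            throw new ArgumentNullException(nameof(country));
        }
    }

    // Guard clauses: same checks in the same order, one level of nesting.
    public static decimal FeeFlat(decimal total, bool isMember, string country)
    {
        if (country == null) throw new ArgumentNullException(nameof(country));
        if (total <= 0) throw new ArgumentOutOfRangeException(nameof(total));
        if (isMember) return 0m;
        return total >= 50m ? 0m : 4.99m;
    }

    public static void Main()
    {
        // The behavioral contract is unchanged by the refactoring.
        Console.WriteLine(FeeNested(20m, false, "NL") == FeeFlat(20m, false, "NL")); // prints True
        Console.WriteLine(FeeNested(80m, false, "NL") == FeeFlat(80m, false, "NL")); // prints True
    }
}
```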
Infrastructure: database, email, HTTP
Application: use cases
Domain: pure, depends on nothing
Track 04
Architecture
Multi-file refactoring katas, graded on behavior and design together.
- Dependency direction, layering, and abstraction boundaries are verified automatically.
- A wrong dependency fails the submission the same way a failing test does.
- Feedback in seconds, not in a code review three weeks later.
The adversarial suite
../../etc/passwd · path traversal · ✓ denied
evil-example.com · suffix confusion · ✓ rejected
aaaaaaaaaaaaaaaaaaaaaa! · ReDoS · ✓ 0.2 ms, no hang
ada\n[INFO] admin granted · log forging · ✓ escaped
Track 05
🛡️ Secure Coding
Functionally correct, quietly exploitable. Close the hole without breaking the feature.
- Two suites grade every submission: functional tests prove the feature still works, adversarial tests throw real attack payloads at it.
- The payloads are the classics that hit production systems: path traversal, Zip Slip, log forging, injection, suffix confusion.
- Resource-exhaustion attacks are graded with real budgets: a ReDoS input must return in milliseconds, a decompression bomb must be rejected within a memory cap.
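For a sense of what closing one of these holes can look like, here is a minimal sketch of a path-traversal check. `SafePaths` and `ResolveInside` are hypothetical names, not part of any puzzle. The sketch assumes it is enough to canonicalize the path and confirm it stays under a base directory; it does not resolve symbolic links, which a production fix may also need to consider.

```csharp
using System;
using System.IO;

public static class SafePaths
{
    // Resolves userPath under baseDir and rejects anything that escapes it.
    public static string ResolveInside(string baseDir, string userPath)
    {
        string root = Path.GetFullPath(baseDir);
        if (!root.EndsWith(Path.DirectorySeparatorChar))
        {
            // The trailing separator stops "/data/uploads-evil" from matching "/data/uploads".
            root += Path.DirectorySeparatorChar;
        }

        // GetFullPath collapses "." and ".." segments without touching the disk.
        string candidate = Path.GetFullPath(Path.Combine(root, userPath));

        if (!candidate.StartsWith(root, StringComparison.Ordinal))
        {
            throw new UnauthorizedAccessException("Path escapes the base directory.");
        }
        return candidate;
    }

    public static void Main()
    {
        string baseDir = Path.Combine(Path.GetTempPath(), "uploads");
        Console.WriteLine(ResolveInside(baseDir, "report.txt")); // a path inside baseDir

        try
        {
            ResolveInside(baseDir, "../../etc/passwd");
        }
        catch (UnauthorizedAccessException)
        {
            Console.WriteLine("denied"); // prints denied
        }
    }
}
```

The ordinal comparison is case-sensitive, so on a case-insensitive file system it can reject a legitimate path written in a different case, but it will not accept one that escapes the base directory.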
Your code stays yours
Submissions are stored so you can see your own history and progress, and that's it. We don't publish them, and the sandbox they ran in is gone seconds after they finish. The details live in the Privacy Policy.
See the grading for yourself
One puzzle is all it takes to understand why server-measured beats green checkmarks.
Start solving: it's free