[SQL] 15 days of learning
![[SQL] 15 days of learning](/content/images/size/w2000/2025/02/15-day-lerning.png)
Problem Statement
Julia conducted a 15 days of learning SQL contest. The start date of the contest was March 01, 2016 and the end date was March 15, 2016.
Write a query to print total number of unique hackers who made at least 1 submission each day (starting on the first day of the contest), and find the hacker_id and name of the hacker who made maximum number of submissions each day. If more than one such hacker has a maximum number of submissions, print the lowest hacker_id. The query should print this information for each day of the contest, sorted by the date.
Sample Output
2016-03-01 4 20703 Angela
2016-03-02 2 79722 Michael
2016-03-03 2 20703 Angela
2016-03-04 2 20703 Angela
2016-03-05 1 36396 Frank
2016-03-06 1 20703 Angela
Find the rest of the information through here: LINK
[Solving problem]: Understanding the Data:
Our data lives in 2 tables:
Hackers (hacker_id, name)
Submissions (submission_date, submission_id, hacker_id, score)
Breaking down the solution
The first step is to identify participants who submitted their work every day starting from the contest’s first day. To do this, we use window functions to rank submission dates globally and per hacker. The query snippet below shows how this is achieved:
SELECT submission_date, hacker_id,
DENSE_RANK() OVER(ORDER BY submission_date) as subDateRank,
DENSE_RANK() OVER(PARTITION BY hacker_id ORDER BY submission_date) as hacIdRank
FROM submissions
DENSE_RANK
is a function which assigns a rank based on the order of dates.
subDateRank
counts the days of the contest.hacIdRank
counts the submission days for each hacker
For example, consider the participation of two contestants, Angela (20703) and Michael (79722):
Angela's participation pattern:
| submission_date | submission_id | subDateRank | hacIdRank |
|---|---|---|---|
| 2016-03-01 | 20703 | 1 | 1 |
| 2016-03-02 | 20703 | 2 | 2 |
| 2016-03-03 | 20703 | 3 | 3 |
| 2016-03-04 | 20703 | 4 | 4 |
| 2016-03-05 | 20703 | 5 | 5 |
| 2016-03-06 | 20703 | 6 | 6 |
Michael's participation pattern:
| submission_date | submission_id | subDateRank | hacIdRank |
|---|---|---|---|
| 2016-03-01 | 79722 | 1 | 1 |
| 2016-03-02 | 79722 | 2 | 2 |
| 2016-03-03 | 79722 | 3 | 3 |
| 2016-03-04 | 79722 | 4 | 4 |
| 2016-03-05 | 79722 | 5 | null |
| 2016-03-06 | 79722 | 6 | null |
By comparing subDateRank with hacIdRank, we can confirm whether a hacker's submitted their work every starting from day 1.
The second part would be to calculate the total number of unique hackers who maintained a consecutive submission. We therefore use COUNT
and DISTINCT
for this purpose.
SELECT
submission_date,
COUNT(DISTINCT hacker_id) as TotalHackers
FROM etc -------
WHERE subDateRank = hacIdRank
GROUP BY submission_date
Next, we need to determine the hacker with the highest number of submissions each day. We can achieve this by grouping submissions by date and then ranking the hackers using the RANK
window function.
RANK()
OVER(PARTITION BY sub.submission_date ORDER BY COUNT(*) DESC, hac.hacker_id ASC)
AS hackerRank
Finally, we join the results from MaxSubmissions with UniqueSubmissions. This gives us, for each day, the submission date, the total number of unique hackers with consecutive participation, the hacker id and hacker name. Then, we filter to only include the highest ranked participator.
SELECT maxSub.submission_date, uniqueSub.TotalHackers, maxSub.hackerId, maxSub.hackerName
FROM MaxSubmissions AS maxSub
JOIN UniqueSubmissions AS uniqueSub
ON maxSub.submission_date = uniqueSub.submission_date
WHERE hackerRank = 1
ORDER BY maxSub.submission_date
Full code:
/*
Enter your query here.
1st march 2016 to 15th march 2016
total no of unique hackers who made atleast 1 submission each day
*/
WITH UniqueSubmissions AS (
SELECT
submission_date,
COUNT(DISTINCT hacker_id) as TotalHackers
FROM (
SELECT submission_date, hacker_id,
DENSE_RANK() OVER(ORDER BY submission_date) as subDateRank,
DENSE_RANK() OVER(PARTITION BY hacker_id ORDER BY submission_date) as hacIdRank
FROM submissions
) temp
WHERE subDateRank = hacIdRank
GROUP BY submission_date
),
/*
find the hacker_id and name of the hacker
who made maximum number of submissions each day
*/
MaxSubmissions AS
(
SELECT sub.submission_date as submission_date, hac.hacker_id as hackerId, hac.name as hackerName, COUNT(*) AS submission_count,
RANK()
OVER(PARTITION BY sub.submission_date ORDER BY COUNT(*) DESC, hac.hacker_id ASC)
AS hackerRank
FROM HACKERS AS hac
JOIN Submissions AS sub
ON hac.hacker_id = sub.hacker_id
GROUP BY sub.submission_date, hac.hacker_id, hac.name
)
SELECT maxSub.submission_date, uniqueSub.TotalHackers, maxSub.hackerId, maxSub.hackerName
FROM MaxSubmissions AS maxSub
JOIN UniqueSubmissions AS uniqueSub
ON maxSub.submission_date = uniqueSub.submission_date
WHERE hackerRank = 1
ORDER BY maxSub.submission_date
;
This solution has a lot of SQL concepts, including; Joins, Window Functions, Subqueries and CTEs.
So I hope that you have enjoyed learning something new today. If you would like a deeper dive into any of these topics, whether through a blog post or a YouTube video, feel free to leave a comment. Thank you ;)
About Me
I am Zaahra, a Google Women Techmakers Ambassador who enjoy mentoring people and writing about technical contents that might help people in their developer journey. I also enjoy building stuffs to solve real life problems.
To reach me:
LinkedIn: https://www.linkedin.com/in/faatimah-iz-zaahra-m-0670881a1/
X (previously Twitter): _fz3hra
GitHub: https://github.com/fz3hra
Cheers,
Umme Faatimah-Iz-Zaahra Mujore | Google Women TechMakers Ambassador | Software Engineer