question

victorgomezz avatar image
victorgomezz asked

Need help building TSQL scripts.

I was asked to build TSQL codes for below, as I am new I need help.

List of tables,

---------------------------------------------------------------------------------------------------------------------
Purchases

Fields:

Id

User_id – foreign key to users table

Price – integer in cents

Purchase_timestamp

Item_id – foreign key to items table

Users

Fields:

User_id

Gender

Age

Level

Country

Installed_timestamp (in UTC)

UserSource

CostToInstall (Price paid to get them)

ItemsPurchased

Fields:

Id

User_id

Type (enumeration of chip, coin, powerplay, combo)

Regular_price – integer in cents

Discount_price – integer in cents

Events

Fields:

Event_code (enumeration of GAME_PLAYED, LOGIN_BONUS_COLLECTED, TUTORIAL_STEP)

Event_timestamp

User_id foreign key to user table

Name – name of the game for GAME_PLAYED events

Number_game_cards_played – integer

Number_chips_used – integer – chips used or bet

Number_chips_won - integer

---------------------------------------------------------------------------------------------------------------------

1.Create SQL to find all sum of prices spent on purchases in the last 30 days for chips that were at a discount to the regular price.

2.Create SQL to find total of purchase prices by country by month for last 1 year for users that are at least level 20.

3.Create SQL to find how many games are played by day by people that have spent less than 20 dollars (1 dollar = 100 cents) in the previous 30 days to the game played. Do this over the last 90 days.

4.Create SQL to find which items have been bought fewer than 10 times per day.

5.Create SQL to find out how many streaks we have of users playing at least one game more than 7 days in a row.

6.SQL to determine the retention rate of new users. Retention being defined as they played a game more than 7 days after they installed. Summarize the overall rate by month based on the day they installed for the last year relative to today’s date.

7.SQL to determine the ROI (Return on Investment) of the users by month. ROI is defined as the Revenue in the first 7 days for each user divided by the Cost of the install of the user. Break this down to show the ROI per UserSource (Advertiser). Summarize the users for each month based on the install date of the user. The first seven days of a user may cross into the next month, but it would be counted in the install month. Show this by month for the last 12 months from today.

8.What is the Return to Player (RTP) of the game called “WinThemAll”? RTP is the ratio of chips won over the chips paid to play for the last 30 days.

tsql
1 comment
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

And what have you tried so far?

What bits are you stuck on?

First things to look at would be CREATE TABLE syntax, CREATE CONSTRAINT syntax etc.

0 Likes 0 ·
victorgomezz avatar image
victorgomezz answered
Here is what I did so far.....




CREATE TABLE dbo.Purchases
	(
	Id int NOT NULL,
	[User_Id] nvarchar(10) NOT NULL,
	Price int NOT NULL,
	Purchase_timestamp datetime NOT NULL,
	item_id int NOT NULL
	)  ON [PRIMARY]


--INSERT INTO DBO.Purchases(ID,User_Id,PRICE,item_id,Purchase_timestamp)
--SELECT	1	,	100	,	1000	,	1	,'1/5/2020' UNION
--SELECT	2	,	100	,	200	,	2	,'1/6/2020' UNION
--SELECT	3	,	100	,	5000	,	3	,'1/7/2020' UNION
--SELECT	4	,	100	,	1000	,	4	,'1/8/2020' UNION
--SELECT	5	,	100	,	450	,	1	,'1/9/2020' UNION
--SELECT	6	,	100	,	800	,	1	,'1/10/2020' UNION
--SELECT	7	,	100	,	100	,	1	,'1/11/2020' UNION
--SELECT	8	,	100	,	500	,	1	,'1/12/2020' UNION
--SELECT	9	,	100	,	2000	,	1	,'1/13/2020' UNION
--SELECT	10	,	101	,	5000	,	1	,'1/14/2020' UNION
--SELECT	11	,	101	,	150	,	2	,'1/15/2020' UNION
--SELECT	12	,	101	,	1000	,	2	,'1/16/2020' UNION
--SELECT	13	,	101	,	200	,	3	,'1/17/2020' UNION
--SELECT	14	,	101	,	5000	,	4	,'1/18/2020' UNION
--SELECT	15	,	101	,	1000	,	1	,'1/19/2020' UNION
--SELECT	16	,	101	,	450	,	2	,'1/9/2020' UNION
--SELECT	17	,	101	,	800	,	2	,'1/10/2020' UNION
--SELECT	18	,	101	,	100	,	3	,'1/11/2020' UNION
--SELECT	19	,	101	,	500	,	4	,'1/12/2020' UNION
--SELECT	20	,	101	,	2000	,	5	,'1/13/2020' UNION
--SELECT	21	,	101	,	5000	,	4	,'1/14/2020' UNION
--SELECT	22	,	101	,	150	,	1	,'1/5/2020' UNION
--SELECT	23	,	101	,	1000	,	2	,'1/6/2020' UNION
--SELECT	24	,	101	,	200	,	2	,'1/7/2020' UNION
--SELECT	25	,	101	,	5000	,	3	,'1/8/2020' UNION
--SELECT	26	,	101	,	1000	,	4	,'1/16/2020' UNION
--SELECT	27	,	101	,	450	,	5	,'1/17/2020' UNION
--SELECT	28	,	101	,	800	,	4	,'1/18/2020'






CREATE TABLE dbo.Users
	(
	[User_Id] nvarchar(10) NOT NULL PRIMARY KEY,
	Gender nvarchar(10),
	Age int,
	[Level] int,
	Country nvarchar(100),
	Installed_timestamp datetime,
	UserSoruce nvarchar(100),
	CostToInstalled int
	)  ON [PRIMARY]
--INSERT INTO DBO.Users
--SELECT	100	,'M',	32	,	30	,'IND','12/15/2019','Advertiser',	100	 UNION
--SELECT	101	,'F',	25	,	36	,'USA','1/15/2019','Other',	250	




CREATE TABLE dbo.ItemPurchased
	(
	Id int NOT NULL,
	[User_Id] nvarchar(10) NOT NULL,
	[Type] int,
	Regular_Price int,
	Discount_Price int
	)  ON [PRIMARY]
--INSERT INTO DBO.ItemPurchased
--SELECT	1	,	100	,'Chip',	100	,	100	 UNION
--SELECT	2	,	100	,'Coin',	500	,	100	 UNION
--SELECT	3	,	100	,'PowerPlay',	450	,	100	 UNION
--SELECT	4	,	100	,'Combo',	1000	,	100	 UNION
--SELECT	5	,	100	,'Chip',	250	,	100	 UNION
--SELECT	6	,	101	,'Chip',	800	,	100	 UNION
--SELECT	7	,	101	,'Coin',	700	,	100	 UNION
--SELECT	8	,	101	,'PowerPlay',	100	,	100	 UNION
--SELECT	9	,	101	,'Combo',	1000	,	100	 UNION
--SELECT	10	,	101	,'Chip',		5000	,	100	




CREATE TABLE dbo.[Events]
	(
	Event_code  int NOT NULL,
	Event_timestamp DATETIME,
	[User_Id] nvarchar(10) NOT NULL,
	Name  nvarchar(100),
	Number_game_cards_played  int,
	Number_chips_used  int,
	Number_chips_won  INT
	)  ON [PRIMARY]


CREATE TABLE dbo.[Items]
	(
	item_id  int NOT NULL,
	TypeDesc varchar(1000)
	)  ON [PRIMARY]


CREATE TABLE dbo.[EventType]
	(
	Eventtype  int NOT NULL,
	TypeDesc varchar(1000)
	)  ON [PRIMARY]


/*
Assumptions:


As I do not have a visibility of the tables' data I am making some assumption based on my understanding,


1) Items table - 
                 I am going with assumption that items table is ENUMERATION of item type when 
				 this gets loaded from the site drop down for CHIP, COIN, POWERPLAY, COMBO. This also can be joined with ItemsPurchased table on Type.
2) Type Enumeration - In ItemsPurchased table I am making Type as Int which can be joined on ITEMS table via type to item_id 
3) Event_code - I am assuming this as Int and creating a seperate table for EventType where I store GAME_PLAYED, LOGIN_BONUS_COLLECTED and TUTORIAL_STEP
				and join with EVENTS table on Event_code and Eventid


*/
   


--1
Select Sum(Price) as TotalPrice
From dbo.Purchases A 
                     inner join dbo.ItemPurchased B ON A.item_id=B.Id
					 inner join dbo.Items C ON B.[Type]=C.Item_id 
Where CAST(A.Purchase_timestamp AS DATE)>= DATEADD(DAY,-30,CAST(GETDATE() AS DATE)) AND 
      ISNULL(B.Discount_Price,0)<ISNULL(B.Regular_Price,0) AND
	  C.[TypeDesc] = 'CHIP' 


--2
Select B.Country,SUM(A.PRICE) as TotalPrice,MONTH(A.Purchase_timestamp) AS MonthNo,YEAR(A.Purchase_timestamp) AS YearNo
FROM DBO.Purchases A INNER JOIN DBO.USERS B ON A.[User_Id]=B.[User_Id]
WHERE B.[Level]>=20 AND CAST(A.Purchase_timestamp AS DATE)>= DATEADD(YEAR,-1,CAST(GETDATE() AS DATE))
GROUP BY B.Country,MONTH(A.Purchase_timestamp),YEAR(A.Purchase_timestamp)


--3
;WITH CTE AS 
(
SELECT DISTINCT [USER_ID]
FROM DBO.Purchases
WHERE CAST(Purchase_timestamp AS DATE)>=DATEADD(DAY,-30,CAST(GETDATE() AS DATE))
GROUP BY [User_Id]
HAVING SUM(PRICE)<2000
)
SELECT A.[USER_ID],COUNT(A.Event_code),CAST(A.Event_timestamp AS DATE)
FROM DBO.[Events] A INNER JOIN DBO.EventType B ON A.[Event_code]=B.[Eventtype]
WHERE A.[User_Id] IN (SELECT [User_Id] FROM CTE) AND
	  CAST(A.Event_timestamp AS DATE)>= DATEADD(DAY,-90,CAST(GETDATE() AS DATE)) AND
	  B.TypeDesc = 'Game_Played'
GROUP BY A.[User_Id],  CAST(A.Event_timestamp AS DATE)




--4
SELECT CAST(A.Purchase_timestamp AS DATE) AS PurchaseDate, B.[Type], count(*) as TotalPurchaseCount
FROM dbo.Purchases A 
                     inner join dbo.ItemPurchased B ON A.item_id=B.Id
					 inner join dbo.Items C ON B.[Type]=C.Item_id 
GROUP BY CAST(A.Purchase_timestamp AS DATE) 
HAVING COUNT(*)<10




--5
;WITH
 
  dates AS (
    SELECT DISTINCT CAST(Event_timestamp AS DATE) as [date],User_Id
    FROM dbo.[Events]
    
  ),


  groups AS (
    SELECT
      ROW_NUMBER() OVER (ORDER BY [date],[user_id]) AS rn,
      dateadd(day, -ROW_NUMBER() OVER (ORDER BY [date],[user_id]), [date]) AS grp,
      [date],[USER_ID]
    FROM dates
  )


SELECT
  COUNT(*) AS consecutiveDates,
  MIN(date) AS minDate,
  MAX(date) AS maxDate, User_Id
 into #Streak
FROM groups
GROUP BY grp,user_id
having COUNT(*)>=7
select count(*) as Streak from #streak




--6








--7






--8


SELECT [User_Id], CASE WHEN ISNULL(A.Number_chips_used,0) = 0 THEN 0
															  ELSE (ISNULL(A.Number_chips_won,0)/A.Number_chips_used)*100
															  END AS [RTP]
FROM DBO.[Events] A INNER JOIN DBO.EventType B ON A.Event_code=B.Eventtype
WHERE B.TypeDesc='Game_Played' AND A.Name='WinThemAll'
							   AND CAST(A.Event_timestamp AS DATE) >= DATEADD(DAY,-30,CAST(GETDATE() AS DATE))
GROUP BY [User_Id]


--9
/* 
HAVING clause can only be used with GROUP BY statement
In case there is a need of adding a condition on aggregation HAVING CLAUSE needs to be used. 
Example:
SELECT CAST(A.Purchase_timestamp AS DATE) AS PurchaseDate, B.[Type], count(*) as TotalPurchaseCount
FROM dbo.Purchases A 
                     inner join dbo.ItemPurchased B ON A.item_id=B.Id
					 inner join dbo.Items C ON B.[Type]=C.Item_id 
GROUP BY CAST(A.Purchase_timestamp AS DATE) 
HAVING COUNT(*)<10
*/


--10
/*
LEFT OUTER JOIN can be used when all the records from left table (first in left join condition) need to be extracted with matching values from right table. 
Left join will give everything from table A and matching values from table B (i.e. Select * From TableA AS X Left out join TableB AS Y ON X.ID=Y.ID)
Inner join will give matching values from table A and table B (i.e. Select * From TableA AS X INNER join TableB AS Y ON X.ID=Y.ID)


--11
/*
select foo from table1
inner join table2 on table1.a = table2.a and table1.b=table2.b;


select foo from table1
inner join table2 on table1.a = table2.a
where table1.b=table2.b;


Answer: The two (the inner joins) are exactly the same.  They will both return the same result.
*/


--12


/*
The two are different.  The one with the where clause is effectively converted into  an inner join.  To see why, there are two rules that are used.  First, the ON clause is completed and then the WHERE clause is done.
So suppose table has two rows


row1:  foo = 1, a = 2, b = 3
row2:  foo = 4, a = 5, b = 6


and table2 has one row


row1:  foo = 1, a = 2, b = 3


Then when you do


select foo from table1 
left outer join table2 on table1.a = table2.a and table1.b=table2.b;


the row with foo = 1 matches one row in table2 so it is returned and the row with foo = 4 does not match any rows in table2, so that row is returned with all the columns in table2 set to NULL.  Since the select only returns  the values of foo, you will get two rows, one with foo = 1 and one with foo = 4.


But when you do


select foo from table1
left outer join table2 on table1.a = table2.a
where table1.b=table2.b;


first the ON clause clause is done and that gets you two rows 


row1: foo = 1, table1.a = 2, table1.b = 3, table2.a = 2, table2.b = 3
row1: foo = 4, table1.a = 5, table1.b = 6, table2.a = NULL, table2.b = NULL


then the where clause is done.  That will return row1 (since the two values of b are equal), but it will not return row2 because the two values of b (5 and NULL) are not equal.  
So this query will return only one row row with foo = 1.




*/


10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

victorgomezz avatar image
victorgomezz answered

i am struggeling with #6 and #7 please assist

10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Jon Crawford avatar image
Jon Crawford answered

6 - you want to calculate their retention rate by aggregating the count of users meeting the definition of "played a game more than 7 days after they installed" over the count of total users, for EACH month in the last twelve months. I think you want to use a crosstab for this, see

https://www.databasejournal.com/features/mysql/article.php/3861886/Tips-for-Simplifying-Crosstab-Query-Statements.htm for a pretty good example that is a followup to an article that he previously wrote. Basically SUM(CASE WHEN something = true and DATEDIFF(mm,timestamp,today) = 12 THEN 1 ELSE 0 END) as '12 mos ago', rinse and repeat 12 times changing the value in DATEDIFF()

7 - basically the same thing, except you have to SUM() the revenue where the timestamp was within 7 days after the install date. Then do the crosstab, the last "blah blah blah" about the month it's counted in is just to say you can use the month portion of the install date without worrying about crossover (which is nice of them)

these are good questions, I like this!

10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Write an Answer

Hint: Notify or tag a user in this post by typing @username.

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.