Writing Stored Procedures And Functions

Temporal EDA, Variables & Date Manipulation

EDA: Exploratory Data analysis In this section, we will study date manipulation using different Date functions such as datepart, datename and datediff

-- Convert date time to date data type
select TOP(1) PickupDate, convert(Date, pickupDate) as DateOnly from YellowTripData

--here we use datepart to select the busiest hours for Taxi pickups
select top 3 count(Id) as BumverOfRides, Datepart(hour, pickupdate) as hour from ... group by datapart(hour, pickupdate) order by count(id) desc

--find the difference between the pickup date and dropoff date in seconds in sunday
select AVG(Datediff(second, pickupdate, dropoffdate) / 60)
as avgRideLenghtInmin
from ... where datename(weekday, pickupdate) = 'Sunday';

exercise: ransactions per day It's time for you to do some temporal EDA on the BikeShare dataset. Write a query to determine how many transactions exist per day.

SELECT
  -- Select the date portion of StartDate
  convert(date, StartDate) as StartDate,
  -- Measure how many records exist for each StartDate
  count(StartDate) as CountOfRows
FROM CapitalBikeShare
-- Group by the date portion of StartDate
group BY convert(date,  StartDate)
-- Sort the results by the date portion of StartDate
order BY convert(date, StartDate);

second exercise: seconds or no seconds? is the second datapart on startDate is equal to zero or not

third exercies: Which day of week is busiest?

fourth exercise: find the outliers what is outlier? In statistics, an outlier is a data point that differs significantly from other observations. An outlier may be due to variability in the measurement or it may indicate experimental error; the latter are sometimes excluded from the data set. An outlier can cause serious problems in statistical analyses. in the previous exercise 'Satureday' was the busiest day of the week, Do you wonder if there were any individual Saturday outliers that contributed to this

Variable for datetime data

User-defined Functions

Stored Procedures

NYC Taxi Ride Case Study

Variables for datetime date: we can create single value variable or table variable

CASTing: sql server use default rules when exeuting CAST() statements so you can cast a time to datetime, even whn there's no datetime present

Table Variables

table variables are not recommended to store more than one hundered recored beuase of potential performance

Date manipulation GETDATE DATEADD -> we don't have DATESUBTRACT because we can do the same functionality using negative number

note very important: the return value of DATEDIFF() depends on the datepart argument you pass

what does this query actually about? when starting to figure out complex query, start by the inner functions GETDATE() returns the current Date Second, we're calling DATEDIFF() to see how many weeks between today and 0 -> which is 1/1/1900 By adding zero weeks to the 6.217th week with the DATEADD(), sql server will return the date of the beginning of the week

Exercise Parameters matter with DATEDIFF How many times, in terms of days, weeks, and months, are the datepart boundaries crossed between the dates 2/26/2018 and 3/3/2018?

answer: 5 days, 0 weeks, 1 month. now you understand the assumption sql server take to evaluate datediff

find the fist day of the current month

User defined functions

Table Valued UDF

there are two types: 1- Inline table valued functions (ITVF)

2-Multi statement table function (MSTFV)

exercises

UDF in action: executing the UDF

the schema must be specified when creating and executing the function if we don't specifiy the schema upon creation, the sql server will by default assign to to the user defulat scehama

exec is used to execute the function and assign the function to the local variable called @TotalRideHrs

EXecuting scalar UDF

it's possibly to execute the function in the where clause, but this affect the query performance

since table valued function returns a table, we can execute it using the select from keywords

you can sepcify order by clause when executing the function, but it's not allowed n thr funciton itsslef

here we created a local table variable with the same structure as the function table result structure

exercises

Maintaing user defined Functions

change the function or modify using ALTER Function

Exercise

Oh crud

Oh EXEC

TRY CATCH

CASE STUDY

Formatting

Calculate Total Fare Amount per Total Distance for each day of week. If the TripDistance is zero use the Hot Deck imputation function you created earlier in the chapter.

Write a query to display the TotalDistance, TotalRideTime and TotalFare for each day and NYC Borough. Display the date, distance, ride time, and fare totals for German culture.

Last updated