SQL Server Features Deep Dive: Interview
Module: Database-Specific Features
Explain the difference between RANK() and DENSE_RANK(). When would you use each?
RANK() and DENSE_RANK() both assign rankings but handle ties differently. RANK(): Leaves gaps after ties. Example: Scores 100, 90, 90, 80 get ranks 1, 2, 2, 4 (gap at 3). DENSE_RANK(): No gaps after ties. Example: Same scores get ranks 1, 2, 2, 3 (no gap). Use RANK() when: (1) Want to show position in ordered list (1st, 2nd, 4th), (2) Gaps are meaningful (show how many ahead). Use DENSE_RANK() when: (1) Want consecutive rankings (1, 2, 3), (2) Gaps not meaningful (just relative order). Real-world: Stack Overflow uses DENSE_RANK() for reputation leaderboard (consecutive ranks). Sports use RANK() for standings (gaps show position). Lesson: RANK() has gaps, DENSE_RANK() no gaps.
How do columnstore indexes provide 10x faster analytics? What are the trade-offs?
Columnstore indexes provide 10x faster analytics through three mechanisms: (1) Columnar storage: Stores data by column (not row), only reads needed columns. Example: SELECT SUM(total) reads only total column (not entire rows). (2) Compression: 5-10x compression (similar values in column compress well). (3) Batch mode execution: Processes 900 rows at once (not row-by-row). Result: 10x faster for aggregations (SUM, COUNT, AVG). Trade-offs: Columnstore: Fast for analytics (aggregations, scans), slow for single-row lookups, higher update overhead. Row-based: Fast for OLTP (single-row lookups, updates), slow for analytics. Use columnstore for: Data warehouses, analytics dashboards, reporting. Use row-based for: OLTP (transactional workloads). Real-world: Power BI uses columnstore for dashboards. Lesson: Columnstore for analytics, row-based for OLTP.
What are temporal tables and how do they work? When would you use them?
Temporal tables (system-versioned tables) automatically track history of changes. How they work: (1) Current table: Stores current data. (2) History table: Stores previous versions automatically. (3) System columns: valid_from, valid_to track time periods. (4) Automatic: Updates/deletes automatically move old version to history. Queries: FOR SYSTEM_TIME AS OF date (point-in-time), FOR SYSTEM_TIME ALL (all history). Use temporal tables when: (1) Audit trails (track who changed what when), (2) Point-in-time queries (data as of specific date), (3) Compliance (regulatory requirements for history). Benefits: Automatic (no triggers needed), efficient (optimized storage), simple queries (AS OF syntax). Real-world: Banking apps use temporal for account history. Healthcare systems use temporal for patient records. Lesson: Temporal tables for automatic history tracking.
Write a query to get top 10 products by revenue in each category using window functions.
-- Top 10 products by revenue in each category
WITH ranked_products AS (
SELECT
category,
product_id,
name,
SUM(quantity * price) AS revenue,
ROW_NUMBER() OVER (
PARTITION BY category
ORDER BY SUM(quantity * price) DESC
) AS rank_in_category
FROM sales s
JOIN products p ON s.product_id = p.product_id
GROUP BY category, product_id, name
)
SELECT
category,
product_id,
name,
revenue,
rank_in_category
FROM ranked_products
WHERE rank_in_category <= 10
ORDER BY category, rank_in_category;
-- Alternative: Using DENSE_RANK for ties
WITH ranked_products AS (
SELECT
category,
product_id,
name,
SUM(quantity * price) AS revenue,
DENSE_RANK() OVER (
PARTITION BY category