温生 發表於 2025-10-24 10:35:59

SQL CTE (Common Table Expression) 高级用法与最佳实践

<div id="navCategory"><h5 class="catalogue">目录</h5><ul class="first_class_ul"><li><a href="#_label0">CTE (Common Table Expression) 详解</a></li><ul class="second_class_ul"><li><a href="#_lab2_0_0">基础概念</a></li><ul class="third_class_ul"><li><a href="#_label3_0_0_0">定义</a></li><li><a href="#_label3_0_0_1">本质特性</a></li><li><a href="#_label3_0_0_2">CTE 主要特点</a></li><li><a href="#_label3_0_0_3">基本语法结构</a></li></ul><li><a href="#_lab2_0_1">CTE类型详解</a></li><ul class="third_class_ul"><li><a href="#_label3_0_1_4">非递归CTE(普通CTE)</a></li><li><a href="#_label3_0_1_5">递归CTE</a></li></ul><li><a href="#_lab2_0_2">语法与执行机制</a></li><ul class="third_class_ul"><li><a href="#_label3_0_2_6">PostgreSQL CTE执行机制</a></li><li><a href="#_label3_0_2_7">递归CTE的终止机制</a></li></ul><li><a href="#_lab2_0_3">性能考虑与优化</a></li><ul class="third_class_ul"><li><a href="#_label3_0_3_8">CTE vs 子查询性能对比</a></li><li><a href="#_label3_0_3_9">内存使用优化</a></li></ul><li><a href="#_lab2_0_4">跨数据库支持</a></li><ul class="third_class_ul"><li><a href="#_label3_0_4_10">主流数据库CTE支持对比</a></li><li><a href="#_label3_0_4_11">数据库特定语法示例</a></li><li><a href="#_label3_0_4_12">兼容性处理策略</a></li></ul><li><a href="#_lab2_0_5">实际应用场景</a></li><ul class="third_class_ul"><li><a href="#_label3_0_5_13">1. 数据分析与报表</a></li><li><a href="#_label3_0_5_14">2. 层级数据处理</a></li><li><a href="#_label3_0_5_15">3. 时间序列数据处理</a></li></ul><li><a href="#_lab2_0_6">最佳实践</a></li><ul class="third_class_ul"><li><a href="#_label3_0_6_16">1. 命名规范</a></li><li><a href="#_label3_0_6_17">2. 结构化组织</a></li><li><a href="#_label3_0_6_18">3. 递归CTE最佳实践</a></li><li><a href="#_label3_0_6_19">4. 性能优化最佳实践</a></li></ul><li><a href="#_lab2_0_7">常见陷阱与注意事项</a></li><ul class="third_class_ul"><li><a href="#_label3_0_7_20">1. 递归CTE陷阱</a></li><li><a href="#_label3_0_7_21">2. 性能陷阱</a></li><li><a href="#_label3_0_7_22">3. 数据类型陷阱</a></li><li><a href="#_label3_0_7_23">4. NULL值处理</a></li></ul><li><a href="#_lab2_0_8">高级用法</a></li><ul class="third_class_ul"><li><a href="#_label3_0_8_24">1. CTE与窗口函数结合</a></li><li><a href="#_label3_0_8_25">2. 递归CTE生成复杂序列</a></li><li><a href="#_label3_0_8_26">3. CTE用于数据清洗和转换</a></li><li><a href="#_label3_0_8_27">4. 递归CTE处理图结构</a></li></ul><li><a href="#_lab2_0_9">总结</a></li><ul class="third_class_ul"><li><a href="#_label3_0_9_28">CTE的核心价值</a></li><li><a href="#_label3_0_9_29">选择CTE的时机</a></li><li><a href="#_label3_0_9_30">关键注意事项</a></li><li><a href="#_label3_0_9_31">最佳实践总结</a></li></ul></ul></ul></div><p class="maodian"><a name="_label0"></a></p><h2>CTE (Common Table Expression) 详解</h2>
<p class="maodian"><a name="_lab2_0_0"></a></p><h3>基础概念</h3>
<p class="maodian"><a name="_label3_0_0_0"></a></p><h4>定义</h4>
<p>CTE(Common Table Expression,公用表表达式)是SQL中的&quot;命名临时结果集&quot;,通过 <code>WITH</code> 关键字定义,仅在当前查询中生效。</p>
<p><strong>核心作用:</strong></p>
<ul><li><strong>简化复杂查询</strong>:将复杂逻辑分解为多个步骤</li><li><strong>提高可读性</strong>:使SQL语句更易理解和维护</li><li><strong>复用子查询结果</strong>:避免重复计算相同的子查询</li></ul>
<p class="maodian"><a name="_label3_0_0_1"></a></p><h4>本质特性</h4>
<ul><li><strong>非物理存储</strong>:不是物理表,不存储在磁盘上</li><li><strong>临时性</strong>:查询执行过程中生成的虚拟结果集</li><li><strong>作用域限制</strong>:仅在定义它的查询语句中有效</li><li><strong>自动销毁</strong>:查询结束后自动清理</li></ul>
<p class="maodian"><a name="_label3_0_0_2"></a></p><h4>CTE 主要特点</h4>
<ul><li>临时结果集:只在查询执行期间存在</li><li>可引用性:可以在主查询中多次引用</li><li>可读性强:比嵌套子查询更易理解</li><li>递归支持:支持递归查询(MySQL 8.0+)</li></ul>
<p class="maodian"><a name="_label3_0_0_3"></a></p><h4>基本语法结构</h4>
<div class="jb51code"><pre class="brush:sql;">WITH cte_name [(column_list)] AS (
    -- CTE定义查询
    SELECT ...
)
-- 主查询
SELECT ... FROM cte_name ...;
</pre></div>
<p class="maodian"><a name="_lab2_0_1"></a></p><h3>CTE类型详解</h3>
<p class="maodian"><a name="_label3_0_1_4"></a></p><h4>非递归CTE(普通CTE)</h4>
<h5>特点</h5>
<ul><li>使用 <code>WITH</code> 关键字</li><li>单一子查询,不引用自身</li><li>一次性执行,结果供主查询使用</li></ul>
<h5>基础示例</h5>
<div class="jb51code"><pre class="brush:sql;">-- 示例1:计算订单统计信息
WITH order_stats AS (
    SELECT
      AVG(amount) as avg_amount,
      MAX(amount) as max_amount,
      COUNT(*) as total_orders
    FROM orders
    WHERE order_date &gt;= '2024-01-01'
)
SELECT
    o.order_id,
    o.amount,
    os.avg_amount,
    CASE
      WHEN o.amount &gt; os.avg_amount THEN '高于平均'
      ELSE '低于平均'
    END as amount_category
FROM orders o
CROSS JOIN order_stats os
WHERE o.order_date &gt;= '2024-01-01';</pre></div>
<h5>多个CTE示例</h5>
<div class="jb51code"><pre class="brush:sql;">-- 示例2:多个CTE协同工作
WITH
high_value_customers AS (
    SELECT customer_id, SUM(amount) as total_spent
    FROM orders
    GROUP BY customer_id
    HAVING SUM(amount) &gt; 10000
),
recent_orders AS (
    SELECT customer_id, COUNT(*) as recent_order_count
    FROM orders
    WHERE order_date &gt;= CURRENT_DATE - INTERVAL '30 days'
    GROUP BY customer_id
)
SELECT
    c.customer_name,
    hvc.total_spent,
    COALESCE(ro.recent_order_count, 0) as recent_orders
FROM customers c
JOIN high_value_customers hvc ON c.customer_id = hvc.customer_id
LEFT JOIN recent_orders ro ON c.customer_id = ro.customer_id
ORDER BY hvc.total_spent DESC;</pre></div>
<p class="maodian"><a name="_label3_0_1_5"></a></p><h4>递归CTE</h4>
<h5>核心结构</h5>
<p>递归CTE必须包含两个部分:</p>
<ol><li><strong>锚点成员</strong>(Anchor Member):递归的起始点,非递归查询</li><li><strong>递归成员</strong>(Recursive Member):引用CTE自身的查询</li></ol>
<h5>执行逻辑</h5>
<ol><li>执行锚点成员,获得初始结果集</li><li>递归成员使用当前结果集查询新数据</li><li>将新结果添加到结果集中</li><li>重复步骤2-3,直到递归成员返回空结果</li><li>返回完整的结果集</li></ol>
<h5>基础递归示例</h5>
<div class="jb51code"><pre class="brush:sql;">-- 示例1:生成数字序列
WITH RECURSIVE number_series AS (
    -- 锚点成员:起始值
    SELECT 1 as n
    UNION ALL
    -- 递归成员:递增逻辑
    SELECT n + 1
    FROM number_series
    WHERE n &lt; 10-- 终止条件
)
SELECT * FROM number_series;</pre></div>
<h5>树形结构查询示例</h5>
<div class="jb51code"><pre class="brush:sql;">-- 示例2:组织架构查询(查找某员工及其所有下属)
WITH RECURSIVE employee_hierarchy AS (
    -- 锚点成员:指定的管理者
    SELECT
      employee_id,
      employee_name,
      manager_id,
      0 as level,
      CAST(employee_name AS VARCHAR(1000)) as path
    FROM employees
    WHERE employee_id = 1001-- 起始员工ID
    UNION ALL
    -- 递归成员:查找下属
    SELECT
      e.employee_id,
      e.employee_name,
      e.manager_id,
      eh.level + 1,
      CAST(eh.path || ' -&gt; ' || e.employee_name AS VARCHAR(1000))
    FROM employees e
    JOIN employee_hierarchy eh ON e.manager_id = eh.employee_id
    WHERE eh.level &lt; 5-- 防止无限递归
)
SELECT
    employee_id,
    employee_name,
    level,
    path as hierarchy_path
FROM employee_hierarchy
ORDER BY level, employee_name;</pre></div>
<h5>向上追溯示例</h5>
<div class="jb51code"><pre class="brush:sql;">-- 示例3:向上追溯管理链
WITH RECURSIVE management_chain AS (
    -- 锚点成员:指定员工
    SELECT
      employee_id,
      employee_name,
      manager_id,
      0 as level_up
    FROM employees
    WHERE employee_id = 2001-- 起始员工
    UNION ALL
    -- 递归成员:查找上级管理者
    SELECT
      e.employee_id,
      e.employee_name,
      e.manager_id,
      mc.level_up + 1
    FROM employees e
    JOIN management_chain mc ON e.employee_id = mc.manager_id
)
SELECT * FROM management_chain ORDER BY level_up;</pre></div>
<p class="maodian"><a name="_lab2_0_2"></a></p><h3>语法与执行机制</h3>
<p class="maodian"><a name="_label3_0_2_6"></a></p><h4>PostgreSQL CTE执行机制</h4>
<h5>物化控制</h5>
<p>PostgreSQL提供了对CTE物化的精确控制:</p>
<div class="jb51code"><pre class="brush:sql;">-- 强制物化(默认行为)
WITH cte_name AS MATERIALIZED (
    SELECT expensive_calculation() FROM large_table
)
SELECT * FROM cte_name
UNION ALL
SELECT * FROM cte_name;-- 复用已计算的结果
-- 禁止物化(内联优化)
WITH cte_name AS NOT MATERIALIZED (
    SELECT * FROM small_table WHERE condition
)
SELECT * FROM cte_name WHERE additional_condition;</pre></div>
<h5>执行计划分析</h5>
<div class="jb51code"><pre class="brush:sql;">-- 查看CTE执行计划
EXPLAIN (ANALYZE, BUFFERS)
WITH sales_summary AS (
    SELECT
      product_id,
      SUM(quantity) as total_quantity,
      SUM(amount) as total_amount
    FROM sales
    WHERE sale_date &gt;= '2024-01-01'
    GROUP BY product_id
)
SELECT
    p.product_name,
    ss.total_quantity,
    ss.total_amount
FROM products p
JOIN sales_summary ss ON p.product_id = ss.product_id;</pre></div>
<p class="maodian"><a name="_label3_0_2_7"></a></p><h4>递归CTE的终止机制</h4>
<h5>自动终止条件</h5>
<ul><li>递归成员返回空结果集</li><li>达到系统递归深度限制</li><li>满足用户定义的终止条件</li></ul>
<h5>防止无限递归的策略</h5>
<div class="jb51code"><pre class="brush:sql;">-- 策略1:使用计数器限制递归深度
WITH RECURSIVE limited_recursion AS (
    SELECT id, parent_id, name, 0 as depth
    FROM categories
    WHERE parent_id IS NULL
    UNION ALL
    SELECT c.id, c.parent_id, c.name, lr.depth + 1
    FROM categories c
    JOIN limited_recursion lr ON c.parent_id = lr.id
    WHERE lr.depth &lt; 10-- 限制最大深度
)
SELECT * FROM limited_recursion;
-- 策略2:使用路径检测避免循环
WITH RECURSIVE path_tracking AS (
    SELECT
      id,
      parent_id,
      name,
      ARRAY as path
    FROM categories
    WHERE parent_id IS NULL
    UNION ALL
    SELECT
      c.id,
      c.parent_id,
      c.name,
      pt.path || c.id
    FROM categories c
    JOIN path_tracking pt ON c.parent_id = pt.id
    WHERE NOT (c.id = ANY(pt.path))-- 避免循环
)
SELECT * FROM path_tracking;</pre></div>
<p class="maodian"><a name="_lab2_0_3"></a></p><h3>性能考虑与优化</h3>
<p class="maodian"><a name="_label3_0_3_8"></a></p><h4>CTE vs 子查询性能对比</h4>
<h5>何时使用CTE</h5>
<div class="jb51code"><pre class="brush:sql;">-- ✅ 推荐:需要多次引用相同结果时
WITH expensive_calc AS (
    SELECT
      customer_id,
      complex_calculation(data) as result
    FROM large_table
    WHERE complex_condition
)
SELECT c1.customer_id, c1.result, c2.result
FROM expensive_calc c1
JOIN expensive_calc c2 ON c1.customer_id = c2.customer_id + 1;
-- ❌ 不推荐:简单的一次性查询
SELECT * FROM (
    SELECT * FROM small_table WHERE simple_condition
) subquery;</pre></div>
<h5>性能优化技巧</h5>
<p><strong>1. 合理使用索引</strong></p>
<div class="jb51code"><pre class="brush:sql;">-- 确保递归CTE中的连接字段有索引
CREATE INDEX idx_categories_parent_id ON categories(parent_id);
CREATE INDEX idx_employees_manager_id ON employees(manager_id);
-- 在递归查询中使用索引友好的条件
WITH RECURSIVE category_tree AS (
    SELECT id, parent_id, name, 0 as level
    FROM categories
    WHERE id = 1-- 使用主键,利用主键索引
    UNION ALL
    SELECT c.id, c.parent_id, c.name, ct.level + 1
    FROM categories c
    JOIN category_tree ct ON c.parent_id = ct.id-- 利用外键索引
    WHERE ct.level &lt; 5
)
SELECT * FROM category_tree;</pre></div>
<p><strong>2. 控制递归深度</strong></p>
<div class="jb51code"><pre class="brush:sql;">-- 设置合理的递归深度限制
SET max_stack_depth = '2MB';-- PostgreSQL
-- 或在查询中使用WHERE条件限制深度
</pre></div>
<p><strong>3. 优化数据类型和字段选择</strong></p>
<div class="jb51code"><pre class="brush:sql;">-- ✅ 只选择必要的字段
WITH RECURSIVE slim_hierarchy AS (
    SELECT id, parent_id, level-- 只选择必要字段
    FROM categories
    WHERE parent_id IS NULL
    UNION ALL
    SELECT c.id, c.parent_id, sh.level + 1
    FROM categories c
    JOIN slim_hierarchy sh ON c.parent_id = sh.id
    WHERE sh.level &lt; 10
)
SELECT sh.id, sh.level, c.name-- 在最后再JOIN获取详细信息
FROM slim_hierarchy sh
JOIN categories c ON sh.id = c.id;</pre></div>
<p class="maodian"><a name="_label3_0_3_9"></a></p><h4>内存使用优化</h4>
<div class="jb51code"><pre class="brush:sql;">-- 大数据量递归查询的分批处理
WITH RECURSIVE batch_process AS (
    SELECT id, parent_id, name, 0 as level, 0 as batch_num
    FROM categories
    WHERE parent_id IS NULL
    UNION ALL
    SELECT c.id, c.parent_id, c.name, bp.level + 1,
         CASE WHEN bp.level % 1000 = 0 THEN bp.batch_num + 1
                ELSE bp.batch_num END
    FROM categories c
    JOIN batch_process bp ON c.parent_id = bp.id
    WHERE bp.level &lt; 10000 AND bp.batch_num &lt; 10
)
SELECT * FROM batch_process;</pre></div>
<p class="maodian"><a name="_lab2_0_4"></a></p><h3>跨数据库支持</h3>
<p class="maodian"><a name="_label3_0_4_10"></a></p><h4>主流数据库CTE支持对比</h4>
<table><thead><tr><th>数据库</th><th>非递归CTE</th><th>递归CTE</th><th>关键差异</th><th>版本要求</th></tr></thead><tbody><tr><td>PostgreSQL</td><td>✅</td><td>✅ (<code>WITH RECURSIVE</code>)</td><td>标准实现,支持物化控制</td><td>8.4+</td></tr><tr><td>MySQL</td><td>✅</td><td>✅ (<code>WITH RECURSIVE</code>)</td><td>8.0后支持,语法与PostgreSQL一致</td><td>8.0+</td></tr><tr><td>SQL Server</td><td>✅</td><td>✅ (<code>WITH</code>)</td><td>递归不需要RECURSIVE关键字</td><td>2005+</td></tr><tr><td>Oracle</td><td>✅</td><td>✅ (<code>WITH</code>)</td><td>支持子查询因子化</td><td>9i+</td></tr><tr><td>SQLite</td><td>✅</td><td>✅ (<code>WITH RECURSIVE</code>)</td><td>轻量实现</td><td>3.8.3+</td></tr></tbody></table>
<p class="maodian"><a name="_label3_0_4_11"></a></p><h4>数据库特定语法示例</h4>
<h5>SQL Server</h5>
<div class="jb51code"><pre class="brush:sql;">-- SQL Server递归CTE(无需RECURSIVE关键字)
WITH employee_cte AS (
    -- 锚点成员
    SELECT employee_id, manager_id, employee_name, 0 as level
    FROM employees
    WHERE manager_id IS NULL
    UNION ALL
    -- 递归成员
    SELECT e.employee_id, e.manager_id, e.employee_name, ec.level + 1
    FROM employees e
    INNER JOIN employee_cte ec ON e.manager_id = ec.employee_id
)
SELECT * FROM employee_cte
OPTION (MAXRECURSION 100);-- SQL Server特有的递归限制语法</pre></div>
<h5>Oracle</h5>
<div class="jb51code"><pre class="brush:sql;">-- Oracle的CTE(子查询因子化)
WITH
sales_data AS (
    SELECT product_id, SUM(amount) as total_sales
    FROM sales
    WHERE sale_date &gt;= DATE '2024-01-01'
    GROUP BY product_id
),
product_info AS (
    SELECT product_id, product_name, category_id
    FROM products
)
SELECT pi.product_name, sd.total_sales
FROM product_info pi
JOIN sales_data sd ON pi.product_id = sd.product_id
ORDER BY sd.total_sales DESC;</pre></div>
<h5>MySQL 8.0+</h5>
<div class="jb51code"><pre class="brush:sql;">-- MySQL递归CTE
WITH RECURSIVE fibonacci AS (
    SELECT 0 as n, 0 as fib_n, 1 as fib_n_plus_1
    UNION ALL
    SELECT n + 1, fib_n_plus_1, fib_n + fib_n_plus_1
    FROM fibonacci
    WHERE n &lt; 20
)
SELECT n, fib_n FROM fibonacci;
</pre></div>
<p class="maodian"><a name="_label3_0_4_12"></a></p><h4>兼容性处理策略</h4>
<h5>旧版本MySQL替代方案</h5>
<div class="jb51code"><pre class="brush:sql;">-- MySQL 5.x 使用临时表替代CTE
-- 替代普通CTE
CREATE TEMPORARY TABLE temp_order_stats AS
SELECT AVG(amount) as avg_amount FROM orders;
SELECT o.*, t.avg_amount
FROM orders o
CROSS JOIN temp_order_stats t
WHERE o.amount &gt; t.avg_amount;
DROP TEMPORARY TABLE temp_order_stats;
-- 替代递归CTE(使用存储过程)
DELIMITER //
CREATE PROCEDURE GetEmployeeHierarchy(IN root_id INT)
BEGIN
    CREATE TEMPORARY TABLE temp_hierarchy (
      employee_id INT,
      level INT
    );
    INSERT INTO temp_hierarchy VALUES (root_id, 0);
    SET @level = 0;
    WHILE ROW_COUNT() &gt; 0 AND @level &lt; 10 DO
      INSERT INTO temp_hierarchy
      SELECT e.employee_id, @level + 1
      FROM employees e
      JOIN temp_hierarchy th ON e.manager_id = th.employee_id
      WHERE th.level = @level;
      SET @level = @level + 1;
    END WHILE;
    SELECT * FROM temp_hierarchy;
    DROP TEMPORARY TABLE temp_hierarchy;
END //
DELIMITER ;</pre></div>
<p class="maodian"><a name="_lab2_0_5"></a></p><h3>实际应用场景</h3>
<p class="maodian"><a name="_label3_0_5_13"></a></p><h4>1. 数据分析与报表</h4>
<h5>销售漏斗分析</h5>
<div class="jb51code"><pre class="brush:sql;">WITH sales_funnel AS (
    SELECT
      'Leads' as stage,
      COUNT(*) as count,
      1 as stage_order
    FROM leads
    WHERE created_date &gt;= '2024-01-01'
    UNION ALL
    SELECT
      'Qualified Leads' as stage,
      COUNT(*) as count,
      2 as stage_order
    FROM leads
    WHERE status = 'qualified' AND created_date &gt;= '2024-01-01'
    UNION ALL
    SELECT
      'Opportunities' as stage,
      COUNT(*) as count,
      3 as stage_order
    FROM opportunities
    WHERE created_date &gt;= '2024-01-01'
    UNION ALL
    SELECT
      'Closed Won' as stage,
      COUNT(*) as count,
      4 as stage_order
    FROM opportunities
    WHERE status = 'won' AND created_date &gt;= '2024-01-01'
),
funnel_with_conversion AS (
    SELECT
      stage,
      count,
      stage_order,
      LAG(count) OVER (ORDER BY stage_order) as previous_count,
      CASE
            WHEN LAG(count) OVER (ORDER BY stage_order) &gt; 0
            THEN ROUND(count::DECIMAL / LAG(count) OVER (ORDER BY stage_order) * 100, 2)
            ELSE 100.0
      END as conversion_rate
    FROM sales_funnel
)
SELECT
    stage,
    count,
    conversion_rate || '%' as conversion_rate
FROM funnel_with_conversion
ORDER BY stage_order;</pre></div>
<h5>同期群分析(Cohort Analysis)</h5>
<div class="jb51code"><pre class="brush:sql;">WITH customer_cohorts AS (
    SELECT
      customer_id,
      DATE_TRUNC('month', MIN(order_date)) as cohort_month
    FROM orders
    GROUP BY customer_id
),
customer_activities AS (
    SELECT
      cc.cohort_month,
      DATE_TRUNC('month', o.order_date) as activity_month,
      COUNT(DISTINCT o.customer_id) as active_customers
    FROM customer_cohorts cc
    JOIN orders o ON cc.customer_id = o.customer_id
    GROUP BY cc.cohort_month, DATE_TRUNC('month', o.order_date)
),
cohort_table AS (
    SELECT
      cohort_month,
      activity_month,
      active_customers,
      EXTRACT(EPOCH FROM (activity_month - cohort_month)) / (30 * 24 * 60 * 60) as month_number
    FROM customer_activities
)
SELECT
    cohort_month,
    month_number,
    active_customers,
    FIRST_VALUE(active_customers) OVER (
      PARTITION BY cohort_month
      ORDER BY month_number
    ) as cohort_size,
    ROUND(
      active_customers::DECIMAL /
      FIRST_VALUE(active_customers) OVER (
            PARTITION BY cohort_month
            ORDER BY month_number
      ) * 100, 2
    ) as retention_rate
FROM cohort_table
ORDER BY cohort_month, month_number;</pre></div>
<p class="maodian"><a name="_label3_0_5_14"></a></p><h4>2. 层级数据处理</h4>
<h5>权限系统递归查询</h5>
<div class="jb51code"><pre class="brush:sql;">-- 查询用户的所有有效权限(包括继承的权限)
WITH RECURSIVE user_permissions AS (
    -- 直接权限
    SELECT
      up.user_id,
      up.permission_id,
      p.permission_name,
      'direct' as permission_source,
      0 as inheritance_level
    FROM user_permissions up
    JOIN permissions p ON up.permission_id = p.permission_id
    WHERE up.user_id = :user_id
    UNION ALL
    -- 角色继承的权限
    SELECT
      ur.user_id,
      rp.permission_id,
      p.permission_name,
      'role:' || r.role_name as permission_source,
      1 as inheritance_level
    FROM user_roles ur
    JOIN roles r ON ur.role_id = r.role_id
    JOIN role_permissions rp ON r.role_id = rp.role_id
    JOIN permissions p ON rp.permission_id = p.permission_id
    WHERE ur.user_id = :user_id
    UNION ALL
    -- 角色层级继承的权限
    SELECT
      up.user_id,
      rp.permission_id,
      p.permission_name,
      'inherited_role:' || pr.role_name as permission_source,
      up.inheritance_level + 1
    FROM user_permissions up
    JOIN user_roles ur ON up.user_id = ur.user_id
    JOIN role_hierarchy rh ON ur.role_id = rh.child_role_id
    JOIN roles pr ON rh.parent_role_id = pr.role_id
    JOIN role_permissions rp ON pr.role_id = rp.role_id
    JOIN permissions p ON rp.permission_id = p.permission_id
    WHERE up.inheritance_level &lt; 3-- 限制继承深度
)
SELECT DISTINCT
    permission_id,
    permission_name,
    MIN(inheritance_level) as min_inheritance_level,
    STRING_AGG(DISTINCT permission_source, ', ') as sources
FROM user_permissions
GROUP BY permission_id, permission_name
ORDER BY min_inheritance_level, permission_name;</pre></div>
<h5>分类目录管理</h5>
<div class="jb51code"><pre class="brush:sql;">-- 移动分类及其所有子分类到新的父分类下
WITH RECURSIVE category_subtree AS (
    -- 要移动的分类及其子分类
    SELECT id, parent_id, name, 0 as level
    FROM categories
    WHERE id = :category_to_move
    UNION ALL
    SELECT c.id, c.parent_id, c.name, cs.level + 1
    FROM categories c
    JOIN category_subtree cs ON c.parent_id = cs.id
),
update_plan AS (
    SELECT
      cs.id,
      CASE
            WHEN cs.level = 0 THEN :new_parent_id
            ELSE cs.parent_id
      END as new_parent_id
    FROM category_subtree cs
)
UPDATE categories
SET parent_id = up.new_parent_id,
    updated_at = CURRENT_TIMESTAMP
FROM update_plan up
WHERE categories.id = up.id;</pre></div>
<p class="maodian"><a name="_label3_0_5_15"></a></p><h4>3. 时间序列数据处理</h4>
<h5>生成时间序列并填充缺失数据</h5>
<div class="jb51code"><pre class="brush:sql;">WITH RECURSIVE date_series AS (
    SELECT DATE '2024-01-01' as date_val
    UNION ALL
    SELECT date_val + INTERVAL '1 day'
    FROM date_series
    WHERE date_val &lt; DATE '2024-12-31'
),
daily_sales AS (
    SELECT
      DATE(order_date) as sale_date,
      SUM(amount) as daily_amount,
      COUNT(*) as daily_orders
    FROM orders
    WHERE order_date &gt;= '2024-01-01'
      AND order_date &lt; '2025-01-01'
    GROUP BY DATE(order_date)
)
SELECT
    ds.date_val,
    COALESCE(dsales.daily_amount, 0) as amount,
    COALESCE(dsales.daily_orders, 0) as orders,
    -- 计算7天移动平均
    AVG(COALESCE(dsales.daily_amount, 0)) OVER (
      ORDER BY ds.date_val
      ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
    ) as moving_avg_7_days
FROM date_series ds
LEFT JOIN daily_sales dsales ON ds.date_val = dsales.sale_date
ORDER BY ds.date_val;</pre></div>
<h5>会话分析</h5>
<div class="jb51code"><pre class="brush:sql;">-- 分析用户会话,定义30分钟无活动为会话结束
WITH RECURSIVE user_sessions AS (
    SELECT
      user_id,
      event_time,
      event_type,
      ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY event_time) as rn,
      event_time as session_start,
      1 as session_id
    FROM user_events
    WHERE user_id = :user_id
      AND event_time &gt;= :start_date
    UNION ALL
    SELECT
      ue.user_id,
      ue.event_time,
      ue.event_type,
      us.rn + 1,
      CASE
            WHEN ue.event_time - us.event_time &gt; INTERVAL '30 minutes'
            THEN ue.event_time
            ELSE us.session_start
      END,
      CASE
            WHEN ue.event_time - us.event_time &gt; INTERVAL '30 minutes'
            THEN us.session_id + 1
            ELSE us.session_id
      END
    FROM user_events ue
    JOIN user_sessions us ON ue.user_id = us.user_id
                        AND ue.event_time &gt; us.event_time
    WHERE ue.user_id = :user_id
      AND ue.event_time &gt;= :start_date
      AND us.rn = (SELECT MAX(rn) FROM user_sessions WHERE user_id = us.user_id)
)
SELECT
    session_id,
    session_start,
    MAX(event_time) as session_end,
    COUNT(*) as event_count,
    MAX(event_time) - session_start as session_duration
FROM user_sessions
GROUP BY session_id, session_start
ORDER BY session_start;</pre></div>
<p class="maodian"><a name="_lab2_0_6"></a></p><h3>最佳实践</h3>
<p class="maodian"><a name="_label3_0_6_16"></a></p><h4>1. 命名规范</h4>
<div class="jb51code"><pre class="brush:sql;">-- ✅ 推荐:使用描述性的CTE名称
WITH
high_value_customers AS (...),
recent_orders AS (...),
product_performance AS (...)
-- ❌ 避免:使用模糊的名称
WITH
cte1 AS (...),
temp AS (...),
data AS (...)</pre></div>
<p class="maodian"><a name="_label3_0_6_17"></a></p><h4>2. 结构化组织</h4>
<div class="jb51code"><pre class="brush:sql;">-- ✅ 推荐:按逻辑顺序组织多个CTE
WITH
-- 基础数据提取
raw_sales_data AS (
    SELECT customer_id, product_id, amount, sale_date
    FROM sales
    WHERE sale_date &gt;= '2024-01-01'
),
-- 数据聚合
customer_totals AS (
    SELECT customer_id, SUM(amount) as total_spent
    FROM raw_sales_data
    GROUP BY customer_id
),
-- 分类标记
customer_segments AS (
    SELECT
      customer_id,
      total_spent,
      CASE
            WHEN total_spent &gt; 10000 THEN 'VIP'
            WHEN total_spent &gt; 5000 THEN 'Premium'
            ELSE 'Standard'
      END as segment
    FROM customer_totals
)
-- 最终查询
SELECT
    c.customer_name,
    cs.total_spent,
    cs.segment
FROM customers c
JOIN customer_segments cs ON c.customer_id = cs.customer_id
ORDER BY cs.total_spent DESC;</pre></div>
<p class="maodian"><a name="_label3_0_6_18"></a></p><h4>3. 递归CTE最佳实践</h4>
<h5>始终包含终止条件</h5>
<div class="jb51code"><pre class="brush:sql;">-- ✅ 推荐:明确的终止条件
WITH RECURSIVE hierarchy AS (
    SELECT id, parent_id, name, 0 as level
    FROM categories
    WHERE parent_id IS NULL
    UNION ALL
    SELECT c.id, c.parent_id, c.name, h.level + 1
    FROM categories c
    JOIN hierarchy h ON c.parent_id = h.id
    WHERE h.level &lt; 10-- 明确的深度限制
      AND c.parent_id IS NOT NULL-- 防止NULL值问题
)
SELECT * FROM hierarchy;</pre></div>
<h5>循环检测</h5>
<div class="jb51code"><pre class="brush:sql;">-- ✅ 推荐:检测和防止循环引用
WITH RECURSIVE safe_hierarchy AS (
    SELECT
      id,
      parent_id,
      name,
      0 as level,
      ARRAY as path
    FROM categories
    WHERE parent_id IS NULL
    UNION ALL
    SELECT
      c.id,
      c.parent_id,
      c.name,
      sh.level + 1,
      sh.path || c.id
    FROM categories c
    JOIN safe_hierarchy sh ON c.parent_id = sh.id
    WHERE sh.level &lt; 20
      AND NOT (c.id = ANY(sh.path))-- 防止循环
)
SELECT id, name, level, array_to_string(path, ' -&gt; ') as path
FROM safe_hierarchy;</pre></div>
<p class="maodian"><a name="_label3_0_6_19"></a></p><h4>4. 性能优化最佳实践</h4>
<h5>合理使用索引</h5>
<div class="jb51code"><pre class="brush:sql;">-- 为递归查询创建合适的索引
CREATE INDEX idx_categories_parent_id ON categories(parent_id);
CREATE INDEX idx_categories_id_parent_id ON categories(id, parent_id);
-- 复合索引用于复杂递归查询
CREATE INDEX idx_employees_manager_dept ON employees(manager_id, department_id);</pre></div>
<h5>限制结果集大小</h5>
<div class="jb51code"><pre class="brush:sql;">-- ✅ 推荐:在CTE中尽早过滤数据
WITH filtered_orders AS (
    SELECT customer_id, amount, order_date
    FROM orders
    WHERE order_date &gt;= '2024-01-01'-- 尽早过滤
      AND status = 'completed'
      AND amount &gt; 0
),
customer_stats AS (
    SELECT
      customer_id,
      COUNT(*) as order_count,
      SUM(amount) as total_amount
    FROM filtered_orders-- 使用已过滤的数据
    GROUP BY customer_id
)
SELECT * FROM customer_stats
WHERE order_count &gt;= 5;-- 进一步过滤</pre></div>
<p class="maodian"><a name="_lab2_0_7"></a></p><h3>常见陷阱与注意事项</h3>
<p class="maodian"><a name="_label3_0_7_20"></a></p><h4>1. 递归CTE陷阱</h4>
<h5>无限递归</h5>
<div class="jb51code"><pre class="brush:sql;">-- ❌ 危险:可能导致无限递归
WITH RECURSIVE dangerous_recursion AS (
    SELECT 1 as n
    UNION ALL
    SELECT n + 1 FROM dangerous_recursion-- 没有终止条件!
)
SELECT * FROM dangerous_recursion;
-- ✅ 安全:包含终止条件
WITH RECURSIVE safe_recursion AS (
    SELECT 1 as n
    UNION ALL
    SELECT n + 1 FROM safe_recursion WHERE n &lt; 100
)
SELECT * FROM safe_recursion;</pre></div>
<h5>循环引用数据</h5>
<div class="jb51code"><pre class="brush:sql;">-- 处理可能存在循环引用的数据
-- 假设categories表中存在循环引用:A -&gt; B -&gt; C -&gt; A
-- ❌ 问题:可能导致无限递归
WITH RECURSIVE bad_hierarchy AS (
    SELECT id, parent_id, name FROM categories WHERE id = 1
    UNION ALL
    SELECT c.id, c.parent_id, c.name
    FROM categories c
    JOIN bad_hierarchy bh ON c.parent_id = bh.id
)
SELECT * FROM bad_hierarchy;
-- ✅ 解决:使用路径跟踪防止循环
WITH RECURSIVE good_hierarchy AS (
    SELECT id, parent_id, name, ARRAY as path
    FROM categories WHERE id = 1
    UNION ALL
    SELECT c.id, c.parent_id, c.name, gh.path || c.id
    FROM categories c
    JOIN good_hierarchy gh ON c.parent_id = gh.id
    WHERE NOT (c.id = ANY(gh.path))
)
SELECT id, parent_id, name FROM good_hierarchy;</pre></div>
<p class="maodian"><a name="_label3_0_7_21"></a></p><h4>2. 性能陷阱</h4>
<h5>过度使用CTE</h5>
<div class="jb51code"><pre class="brush:sql;">-- ❌ 过度使用:简单查询不需要CTE
WITH simple_cte AS (
    SELECT * FROM users WHERE status = 'active'
)
SELECT * FROM simple_cte WHERE age &gt; 18;
-- ✅ 直接查询更简单高效
SELECT * FROM users
WHERE status = 'active' AND age &gt; 18;</pre></div>
<h5>大数据量递归</h5>
<div class="jb51code"><pre class="brush:sql;">-- ❌ 问题:大数据量递归可能导致内存溢出
WITH RECURSIVE large_hierarchy AS (
    SELECT id, parent_id, name FROM large_table WHERE parent_id IS NULL
    UNION ALL
    SELECT lt.id, lt.parent_id, lt.name
    FROM large_table lt
    JOIN large_hierarchy lh ON lt.parent_id = lh.id
)
SELECT * FROM large_hierarchy;
-- ✅ 解决:分批处理或限制深度
WITH RECURSIVE controlled_hierarchy AS (
    SELECT id, parent_id, name, 0 as level FROM large_table WHERE parent_id IS NULL
    UNION ALL
    SELECT lt.id, lt.parent_id, lt.name, ch.level + 1
    FROM large_table lt
    JOIN controlled_hierarchy ch ON lt.parent_id = ch.id
    WHERE ch.level &lt; 5-- 限制深度
)
SELECT * FROM controlled_hierarchy;</pre></div>
<p class="maodian"><a name="_label3_0_7_22"></a></p><h4>3. 数据类型陷阱</h4>
<h5>UNION ALL类型不匹配</h5>
<div class="jb51code"><pre class="brush:sql;">-- ❌ 问题:数据类型不匹配
WITH RECURSIVE type_mismatch AS (
    SELECT 1 as id, 'root' as name-- name是VARCHAR
    UNION ALL
    SELECT id + 1, id + 1 FROM type_mismatch WHERE id &lt; 5-- name变成了INTEGER
)
SELECT * FROM type_mismatch;
-- ✅ 解决:确保类型一致
WITH RECURSIVE type_consistent AS (
    SELECT 1 as id, 'root' as name
    UNION ALL
    SELECT id + 1, CAST(id + 1 AS VARCHAR) FROM type_consistent WHERE id &lt; 5
)
SELECT * FROM type_consistent;</pre></div>
<p class="maodian"><a name="_label3_0_7_23"></a></p><h4>4. NULL值处理</h4>
<div class="jb51code"><pre class="brush:sql;">-- ✅ 正确处理NULL值
WITH RECURSIVE null_safe_hierarchy AS (
    SELECT id, parent_id, name, 0 as level
    FROM categories
    WHERE parent_id IS NULL-- 明确处理NULL
    UNION ALL
    SELECT c.id, c.parent_id, c.name, nsh.level + 1
    FROM categories c
    JOIN null_safe_hierarchy nsh ON c.parent_id = nsh.id
    WHERE c.parent_id IS NOT NULL-- 防止NULL值问题
      AND nsh.level &lt; 10
)
SELECT * FROM null_safe_hierarchy;</pre></div>
<p class="maodian"><a name="_lab2_0_8"></a></p><h3>高级用法</h3>
<p class="maodian"><a name="_label3_0_8_24"></a></p><h4>1. CTE与窗口函数结合</h4>
<div class="jb51code"><pre class="brush:sql;">-- 计算每个产品的销售趋势
WITH monthly_sales AS (
    SELECT
      product_id,
      DATE_TRUNC('month', order_date) as month,
      SUM(amount) as monthly_amount
    FROM orders
    WHERE order_date &gt;= '2024-01-01'
    GROUP BY product_id, DATE_TRUNC('month', order_date)
),
sales_with_trends AS (
    SELECT
      product_id,
      month,
      monthly_amount,
      LAG(monthly_amount) OVER (PARTITION BY product_id ORDER BY month) as prev_month_amount,
      AVG(monthly_amount) OVER (
            PARTITION BY product_id
            ORDER BY month
            ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
      ) as moving_avg_3_months
    FROM monthly_sales
)
SELECT
    p.product_name,
    swt.month,
    swt.monthly_amount,
    swt.moving_avg_3_months,
    CASE
      WHEN swt.prev_month_amount IS NULL THEN 'N/A'
      WHEN swt.monthly_amount &gt; swt.prev_month_amount THEN 'Increasing'
      WHEN swt.monthly_amount &lt; swt.prev_month_amount THEN 'Decreasing'
      ELSE 'Stable'
    END as trend
FROM sales_with_trends swt
JOIN products p ON swt.product_id = p.product_id
ORDER BY p.product_name, swt.month;</pre></div>
<p class="maodian"><a name="_label3_0_8_25"></a></p><h4>2. 递归CTE生成复杂序列</h4>
<h5>生成斐波那契数列</h5>
<div class="jb51code"><pre class="brush:sql;">WITH RECURSIVE fibonacci AS (
    SELECT
      1 as n,
      0::BIGINT as fib_current,
      1::BIGINT as fib_next
    UNION ALL
    SELECT
      n + 1,
      fib_next,
      fib_current + fib_next
    FROM fibonacci
    WHERE n &lt; 50 AND fib_next &lt; 9223372036854775807-- 防止溢出
)
SELECT n, fib_current as fibonacci_number
FROM fibonacci;</pre></div>
<h5>生成工作日序列</h5>
<div class="jb51code"><pre class="brush:sql;">WITH RECURSIVE business_days AS (
    SELECT DATE '2024-01-01' as business_date
    WHERE EXTRACT(DOW FROM DATE '2024-01-01') BETWEEN 1 AND 5
    UNION ALL
    SELECT
      CASE
            WHEN EXTRACT(DOW FROM business_date + 1) = 6 THEN business_date + 3-- 跳过周末
            WHEN EXTRACT(DOW FROM business_date + 1) = 0 THEN business_date + 2
            ELSE business_date + 1
      END
    FROM business_days
    WHERE business_date &lt; DATE '2024-12-31'
),
business_days_with_holidays AS (
    SELECT bd.business_date
    FROM business_days bd
    LEFT JOIN holidays h ON bd.business_date = h.holiday_date
    WHERE h.holiday_date IS NULL-- 排除节假日
)
SELECT business_date FROM business_days_with_holidays ORDER BY business_date;</pre></div>
<p class="maodian"><a name="_label3_0_8_26"></a></p><h4>3. CTE用于数据清洗和转换</h4>
<div class="jb51code"><pre class="brush:sql;">-- 复杂的数据清洗流程
WITH
-- 第一步:基础数据清洗
cleaned_raw_data AS (
    SELECT
      customer_id,
      TRIM(UPPER(customer_name)) as customer_name,
      CASE
            WHEN email ~* '^+@+\.{2,}$'
            THEN LOWER(email)
            ELSE NULL
      END as email,
      CASE
            WHEN phone ~ '^\d{10,15}$' THEN phone
            ELSE REGEXP_REPLACE(phone, '[^\d]', '', 'g')
      END as phone
    FROM raw_customer_data
    WHERE customer_name IS NOT NULL
),
-- 第二步:去重处理
deduplicated_data AS (
    SELECT DISTINCT ON (customer_name, email)
      customer_id,
      customer_name,
      email,
      phone,
      ROW_NUMBER() OVER (PARTITION BY customer_name, email ORDER BY customer_id) as rn
    FROM cleaned_raw_data
    WHERE email IS NOT NULL
),
-- 第三步:数据验证
validated_data AS (
    SELECT
      customer_id,
      customer_name,
      email,
      phone,
      CASE
            WHEN LENGTH(customer_name) &lt; 2 THEN 'Invalid Name'
            WHEN email IS NULL THEN 'Invalid Email'
            WHEN LENGTH(phone) &lt; 10 THEN 'Invalid Phone'
            ELSE 'Valid'
      END as validation_status
    FROM deduplicated_data
    WHERE rn = 1
)
-- 最终结果
SELECT
    customer_id,
    customer_name,
    email,
    phone,
    validation_status
FROM validated_data
WHERE validation_status = 'Valid';</pre></div>
<p class="maodian"><a name="_label3_0_8_27"></a></p><h4>4. 递归CTE处理图结构</h4>
<h5>查找图中的所有路径</h5>
<div class="jb51code"><pre class="brush:sql;">-- 在有向图中查找从起点到终点的所有路径
WITH RECURSIVE all_paths AS (
    -- 起始节点
    SELECT
      start_node,
      end_node,
      ARRAY as path,
      1 as path_length
    FROM graph_edges
    WHERE start_node = :start_point
    UNION ALL
    -- 扩展路径
    SELECT
      ap.start_node,
      ge.end_node,
      ap.path || ge.end_node,
      ap.path_length + 1
    FROM all_paths ap
    JOIN graph_edges ge ON ap.end_node = ge.start_node
    WHERE NOT (ge.end_node = ANY(ap.path))-- 避免循环
      AND ap.path_length &lt; 10-- 限制路径长度
)
SELECT
    start_node,
    end_node,
    path,
    path_length
FROM all_paths
WHERE end_node = :end_point-- 过滤到目标节点的路径
ORDER BY path_length, path;</pre></div>
<p class="maodian"><a name="_lab2_0_9"></a></p><h3>总结</h3>
<p class="maodian"><a name="_label3_0_9_28"></a></p><h4>CTE的核心价值</h4>
<ol><li><strong>代码可读性</strong>:将复杂查询分解为逻辑清晰的步骤</li><li><strong>代码复用</strong>:在同一查询中多次引用相同的子查询结果</li><li><strong>递归处理</strong>:优雅处理层级和树形结构数据</li><li><strong>性能优化</strong>:通过物化避免重复计算</li></ol>
<p class="maodian"><a name="_label3_0_9_29"></a></p><h4>选择CTE的时机</h4>
<ul><li><strong>使用CTE</strong>:需要多次引用子查询结果、处理递归数据、提高复杂查询可读性</li><li><strong>避免CTE</strong>:简单的一次性查询、对性能要求极高的场景</li></ul>
<p class="maodian"><a name="_label3_0_9_30"></a></p><h4>关键注意事项</h4>
<ol><li><strong>递归终止</strong>:始终包含明确的终止条件</li><li><strong>循环检测</strong>:在可能存在循环的数据中使用路径跟踪</li><li><strong>性能监控</strong>:关注CTE的执行计划和资源使用</li><li><strong>类型一致</strong>:确保UNION ALL中的数据类型匹配</li><li><strong>索引优化</strong>:为递归查询的连接字段创建合适的索引</li></ol>
<p class="maodian"><a name="_label3_0_9_31"></a></p><h4>最佳实践总结</h4>
<ul><li>使用描述性的CTE名称</li><li>按逻辑顺序组织多个CTE</li><li>在CTE中尽早过滤数据</li><li>合理控制递归深度</li><li>正确处理NULL值</li><li>定期监控和优化性能</li></ul>
<p>CTE是SQL中强大而灵活的工具,掌握其正确使用方法能够显著提升SQL查询的质量和可维护性。在实际应用中,应根据具体场景选择合适的CTE类型,并遵循最佳实践以确保查询的正确性和性能。</p>
頁: [1]
查看完整版本: SQL CTE (Common Table Expression) 高级用法与最佳实践