PostgreSQL: Full-Text Search: Concept
Module: Database-Specific Features
PostgreSQL Full-Text Search is built-in search engine without external services. Key concepts: (1) tsvector: Searchable document format (normalized, stemmed, indexed). (2) tsquery: Search query format (supports AND/OR/NOT). (3) @@ operator: Matches tsvector against tsquery. (4) GIN indexes: Essential for performance (100x faster). (5) Ranking: ts_rank() orders results by relevance. (6) Language config: Stemming (running → run), stop words (the, a, an removed). Real-world: Medium uses PostgreSQL FTS for article search. Shopify uses it for product search. Trade-off: Simpler than Elasticsearch (no separate service) but less powerful (no fuzzy search, synonyms).
**1. Full-Text Search Basics - tsvector and tsquery:**
Full-text search finds documents matching search queries (like Google search).
**How it works:**
1. **tsvector**: Convert text to searchable format (normalized, stemmed)
2. **tsquery**: Create search query (keywords with operators)
3. **@@ operator**: Match tsvector against tsquery
4. **Ranking**: Order results by relevance
**tsvector - Searchable Document Format:**
```sql
-- Convert text to tsvector
SELECT to_tsvector('english', 'The quick brown fox jumps over the lazy dog');
-- Returns: 'brown':3 'dog':9 'fox':4 'jump':5 'lazi':8 'quick':2
-- Note: Stop words removed (the, over), stemmed (jumps -> jump, lazy -> lazi)
-- tsvector format: 'word':position
-- Position used for phrase search and ranking
-- Example with article
SELECT to_tsvector('english', 'PostgreSQL is a powerful database. PostgreSQL supports full-text search.');
-- Returns: 'databas':5 'full':7 'full-text':7,8 'postgresql':1,6 'power':4 'search':9 'support':6 'text':8
```
**tsquery - Search Query Format:**
```sql
-- Simple query
SELECT to_tsquery('english', 'postgresql');
-- Returns: 'postgresql'
-- AND operator (&)
SELECT to_tsquery('english', 'postgresql & search');
-- Returns: 'postgresql' & 'search'
-- Matches documents with both words
-- OR operator (|)
SELECT to_tsquery('english', 'postgresql | mysql');
-- Returns: 'postgresql' | 'mysql'
-- Matches documents with either word
-- NOT operator (!)
SELECT to_tsquery('english', 'postgresql & !mysql');
-- Returns: 'postgresql' & !'mysql'
-- Matches documents with postgresql but not mysql
-- Phrase search (<->)
SELECT to_tsquery('english', 'full <-> text <-> search');
-- Returns: 'full' <-> 'text' <-> 'search'