Automation Architecture
Multi-Tenant Architecture Overview
Looking at our current database schema, we have a solid foundation:
user_profiles → farms → rows → racks → shelves → schedules
↓
device_assignments → scheduled_actions
Each user is isolated at the farm level, ensuring complete independence between users.
The Reliability Implementation
1. Scheduled Action Generation
When a user starts a new grow schedule, the system automatically generates all required actions for the entire grow cycle:
-- Example: User A starts lettuce (30-day cycle, water every 4 hours, 16h light cycle)
-- System creates ~360 scheduled actions (180 watering + 60 light on/off + nutrients, etc.)
2. Cron-Based Execution Engine
We'd implement a dedicated Edge Function that runs every minute:
// supabase/functions/scheduled-action-processor/index.ts
export default async function(req: Request) {
// Get all farms with pending actions in next 60 seconds
const farms = await getFarmsWithPendingActions();
// Process each farm independently (parallel execution)
await Promise.allSettled(
farms.map(farm => processFarmActions(farm.id))
);
}
3. Farm-Isolated Processing
Each farm's actions are processed in complete isolation:
-- Process only User A's farm actions
SELECT sa.* FROM scheduled_actions sa
JOIN device_assignments da ON sa.device_assignment_id = da.id
JOIN farms f ON (da.farm_id = f.id OR da.row_id IN (SELECT id FROM rows WHERE farm_id = f.id))
WHERE f.id = 'user-a-farm-id'
AND sa.execution_time <= NOW() + INTERVAL '1 minute'
AND sa.status = 'pending';
4. Failure Isolation & Recovery
If User A's Home Assistant is offline, it doesn't affect Users B or C:
async function processFarmActions(farmId: string) {
try {
const actions = await getPendingActions(farmId);
for (const action of actions) {
try {
await executeAction(action);
await markActionExecuted(action.id);
} catch (error) {
await handleActionFailure(action.id, error);
// Retry logic: 1min, 5min, 15min, then mark failed
}
}
} catch (farmError) {
// Farm-level error doesn't affect other farms
await logFarmError(farmId, farmError);
}
}
Real-World Example Scenario
Let's say at 8:00 AM:
- User A (Lettuce Farm): 3 shelves need watering, 2 need lights on
- User B (Tomato Farm): 1 shelf needs nutrient dosing, 4 need lights on
- User C (Herb Farm): 2 shelves need watering, 1 needs fan activation
Our system processes these concurrently and independently:
// 8:00 AM execution
await Promise.allSettled([
processFarmActions('user-a-farm'), // 5 actions
processFarmActions('user-b-farm'), // 5 actions
processFarmActions('user-c-farm'), // 3 actions
]);
// Even if User B's Home Assistant is down,
// Users A and C still get their automation
Reliability Guarantees
Database-Level Reliability
-- Atomic action execution with logging
CREATE OR REPLACE FUNCTION execute_scheduled_action(action_id UUID)
RETURNS BOOLEAN AS $$
BEGIN
-- Update status to 'executing' to prevent duplicates
UPDATE scheduled_actions
SET status = 'executing', executed_at = NOW()
WHERE id = action_id AND status = 'pending';
-- If no rows updated, action already processed
IF NOT FOUND THEN
RETURN FALSE;
END IF;
-- Execute the actual device control
-- (This calls our immediate device control functions)
RETURN TRUE;
END;
$$ LANGUAGE plpgsql;
Retry Logic with Exponential Backoff
const retrySchedule = [
1 * 60 * 1000, // 1 minute
5 * 60 * 1000, // 5 minutes
15 * 60 * 1000, // 15 minutes
60 * 60 * 1000, // 1 hour
];
async function handleActionFailure(actionId: string, error: Error) {
const action = await getAction(actionId);
const retryCount = action.retry_count || 0;
if (retryCount < retrySchedule.length) {
// Schedule retry
await scheduleRetry(actionId, retrySchedule[retryCount]);
} else {
// Mark as failed, alert farm manager
await markActionFailed(actionId, error.message);
await alertFarmManager(action.farm_id, actionId);
}
}
Circuit Breaker for Problematic Devices
-- Temporarily disable devices with repeated failures
CREATE OR REPLACE FUNCTION check_device_health(device_id UUID)
RETURNS BOOLEAN AS $$
DECLARE
failure_count INTEGER;
BEGIN
-- Count failures in last hour
SELECT COUNT(*) INTO failure_count
FROM scheduled_actions sa
WHERE sa.device_assignment_id = device_id
AND sa.status = 'failed'
AND sa.executed_at > NOW() - INTERVAL '1 hour';
-- If 3+ failures, temporarily disable
IF failure_count >= 3 THEN
UPDATE device_assignments
SET status = 'temporarily_disabled'
WHERE id = device_id;
RETURN FALSE;
END IF;
RETURN TRUE;
END;
$$ LANGUAGE plpgsql;
Monitoring Dashboard
Farm managers see real-time automation status:
- ✅ Shelf A-1: Watered at 8:00 AM (Success)
- ✅ Shelf A-2: Lights on at 8:00 AM (Success)
- ❌ Shelf A-3: Watering failed - pump offline (Retrying in 5 min)
- ⏳ Shelf A-4: Nutrient dose scheduled for 8:30 AM
System administrators see health across all 50 farms: - Farm Success Rate: 98.5% (last 24h) - Active Farms: 47/50 (3 have temporary issues) - Actions Processed: 2,847 today - Failed Actions: 23 (all retrying or resolved)
Scalability Benefits
This architecture scales beautifully because:
- Parallel Processing: All 50 farms process simultaneously
- Database Efficiency: Indexed queries by farm and execution time
- Resource Isolation: One farm's issues don't consume resources from others
- Horizontal Scaling: Can easily handle 500+ farms with the same pattern
The result is a system where each user gets reliable, independent automation regardless of what happens with other users' farms. Your lettuce gets watered on schedule even if someone else's tomato farm has connectivity issues!