Cron Scheduling
Overview
A cron schedule is a time-based pattern that tells Precog when to load data from your sources into your destination. Precog uses Quartz cron expressions, a flexible format that lets you define exactly when and how often data loads occur — down to the second.
With cron scheduling, Precog can refresh your data automatically on a set schedule, so your destination always stays up to date without manual effort.
Why It Matters
Scheduling data loads is about balancing two priorities: performance and freshness. Frequent loads keep your data current but can consume more resources; infrequent loads save processing time but may delay updates.
Cron scheduling gives you precise control over when data moves. You can time loads to avoid peak business hours, align refreshes with reporting cycles, or run heavier tasks overnight. The result is a system that stays efficient and keeps your data ready when you need it.
How It Works in Precog
In Precog's new UI, each source can have its own schedule that determines when data is loaded into its destination.
Precog uses Quartz-style cron expressions, which include six or seven positions instead of the five used in standard UNIX cron. The additional position lets you schedule to the exact second, and an optional final position can restrict the schedule to a specific year.
A Quartz cron expression is written as:
Seconds Minutes Hours Day-of-Month Month Day-of-Week [Year (optional)]
To read one, simply move left to right:
- The first position sets the second of the minute at which the run starts.
- The second position defines the minute of the hour.
- The third position sets the hour of the day (on a 24-hour clock).
- The fourth position defines which day of the month to run on.
- The fifth position identifies the month.
- The sixth position identifies the day of the week.
- The seventh position (optional) limits the schedule to a specific year.
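The left-to-right reading order can be sketched in a few lines of Python. This is only an illustration of the field layout described here, not part of Precog or Quartz; the function name `label_fields` and the field labels are made up for the example.

```python
# Hypothetical helper: pairs each position of a Quartz-style expression
# with the name of the field it represents. Not a Precog or Quartz API.
FIELD_NAMES = [
    "seconds",
    "minutes",
    "hours",
    "day_of_month",
    "month",
    "day_of_week",
    "year",  # optional seventh position
]


def label_fields(expression: str) -> dict[str, str]:
    """Split an expression on whitespace and label each position."""
    parts = expression.split()
    if len(parts) not in (6, 7):
        raise ValueError(f"expected 6 or 7 positions, got {len(parts)}")
    # zip stops at the shorter sequence, so "year" is only included
    # when a seventh position is present.
    return dict(zip(FIELD_NAMES, parts))


for name, value in label_fields("0 30 6 ? * MON").items():
    print(f"{name:>12}: {value}")
```

The helper only labels positions. It does not check that each value is legal for its field.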
You'll often see a ? in either the day-of-month or the day-of-week position. In Quartz, ? means "no specific value", and it is required when you schedule by one type of day but not the other. For example, to load data on the 10th of every month regardless of the weekday, use 10 in the day-of-month field and ? in the day-of-week field. One of these two fields must be ?, because Quartz doesn't allow a specific day of the month and a specific day of the week in the same expression.
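As a rough illustration of the ? rule, the following Python sketch checks the two day positions before an expression is saved. It reads "one of these two fields" as exactly one, which is an assumption of this example. The function name `has_valid_day_fields` is hypothetical, and the check covers only this single rule, not full Quartz validation.

```python
# Hypothetical pre-check for the "?" rule only; it does not validate
# anything else about the expression.
def has_valid_day_fields(expression: str) -> bool:
    """Return True when exactly one of day-of-month / day-of-week is '?'."""
    parts = expression.split()
    if len(parts) not in (6, 7):
        return False
    day_of_month = parts[3]  # fourth position
    day_of_week = parts[5]   # sixth position
    return [day_of_month, day_of_week].count("?") == 1


print(has_valid_day_fields("0 0 0 10 * ?"))    # 10th of the month, any weekday
print(has_valid_day_fields("0 0 0 10 * MON"))  # both day fields are specific
```

The first call returns True and the second returns False, since the second expression names both a day of the month and a day of the week.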
Precog runs all schedules in UTC by default.
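Because schedules default to UTC, a time chosen in local terms has to be converted before it goes into the expression. The sketch below shows one way to do that arithmetic in Python. The function name `local_time_to_utc` and the UTC-5 offset are assumptions made for this example, and a fixed offset ignores daylight saving changes.

```python
from datetime import datetime, timedelta, timezone


# Hypothetical helper: converts a local wall-clock time at a fixed UTC
# offset into the hour and minute to write in a UTC-based expression.
def local_time_to_utc(hour: int, minute: int, utc_offset_hours: float) -> tuple[int, int]:
    local_zone = timezone(timedelta(hours=utc_offset_hours))
    # The date is arbitrary; with a fixed offset only the time of day matters.
    local = datetime(2024, 1, 15, hour, minute, tzinfo=local_zone)
    as_utc = local.astimezone(timezone.utc)
    return as_utc.hour, as_utc.minute


# A nightly load wanted at 2:00 AM in a zone five hours behind UTC.
utc_hour, utc_minute = local_time_to_utc(2, 0, -5)
print(f"0 {utc_minute} {utc_hour} * * ?")  # prints "0 0 7 * * ?"
```

Note that the conversion can cross midnight. For example, 10:30 PM at UTC-5 is 3:30 AM UTC on the following day, which matters when the expression also names a day of the week or a day of the month.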
Examples
Here are some examples that show how to read cron expressions in words:
- 0 0 2 * * ? → Run every day at 2:00 AM (nightly load)
- 0 0 0/4 * * ? → Run every 4 hours throughout the day
- 0 30 6 ? * MON → Run every Monday at 6:30 AM (weekly refresh)
- 0 0 0 1 * ? → Run on the 1st day of every month at midnight (monthly batch)
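The 0/4 in the second example uses Quartz's increment syntax: the value before the slash is the starting point and the value after it is the step. A small Python sketch, with a hypothetical function name `expand_increment`, shows which hours that field matches.

```python
# Hypothetical helper: expands a "start/step" field into the values it
# matches. upper_bound is exclusive (24 for hours, 60 for minutes or seconds).
def expand_increment(field: str, upper_bound: int) -> list[int]:
    start_text, step_text = field.split("/")
    # In Quartz, "*" before the slash is equivalent to starting at 0.
    start = 0 if start_text == "*" else int(start_text)
    step = int(step_text)
    if step <= 0:
        raise ValueError("step must be a positive integer")
    return list(range(start, upper_bound, step))


print(expand_increment("0/4", 24))  # prints [0, 4, 8, 12, 16, 20]
```

So 0 0 0/4 * * ? fires six times a day, at 00:00, 04:00, 08:00, 12:00, 16:00, and 20:00.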
If a load takes longer than the schedule interval, the next run waits until the current one completes. Adjusting schedules for heavier sources prevents overlap and keeps data moving smoothly.
Practical Advice
A little trial and error is normal when creating schedules. Cron syntax takes some experimenting, especially when deciding where to place the ? in a Quartz expression. Working examples, such as the ones above, are a good starting point.
Watch for long-running sources. If a source takes longer to load than the schedule interval, later runs will wait until the current one finishes. For example, if a load is scheduled every 5 minutes but the process takes 8 minutes, the next run won't start until the previous one completes. This can delay when data becomes available in your destination.
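To make the 5-minute versus 8-minute case concrete, here is a small Python simulation. It rests on an assumption that this page does not confirm: every scheduled run is eventually executed, in order, and a delayed run starts the moment the previous one finishes. Precog's actual handling of missed runs may differ. The function name `simulate_delays` is hypothetical.

```python
# Simplified model, not Precog's scheduler: runs never overlap, and a
# delayed run starts as soon as the previous run has finished (assumption).
def simulate_delays(interval_min: int, duration_min: int, runs: int) -> list[tuple[int, int, int]]:
    """Return (scheduled, actual_start, delay) in minutes for each run."""
    results = []
    previous_finish = 0
    for i in range(runs):
        scheduled = i * interval_min
        actual_start = max(scheduled, previous_finish)
        previous_finish = actual_start + duration_min
        results.append((scheduled, actual_start, actual_start - scheduled))
    return results


for scheduled, actual_start, delay in simulate_delays(5, 8, 4):
    print(f"scheduled at {scheduled:>2} min, started at {actual_start:>2} min, delayed {delay} min")
```

Under this model the delay grows by 3 minutes with every run (0, 3, 6, 9), which is why an interval shorter than the load time is worth avoiding.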
To avoid delays, you can assign separate schedules:
- Use a dedicated schedule for high-priority sources that need frequent updates.
- Use a more relaxed schedule for less critical data that changes infrequently.
Having multiple schedules for different sources doesn't affect pricing and can help keep data flowing efficiently across your destinations.