feat: add Doubao (Ark) LLM provider by vegerot · Pull Request #209 · JerryZLiu/Dayflow

vegerot · 2026-02-17T03:58:13Z

Adds Volcengine Ark/Doubao as a selectable provider with onboarding + settings wiring and connection testing.

use Ark video understanding for screenshot transcription
Send a single timelapse MP4 via Ark chat-completions video_url instead of attaching per-frame image_url, and expand video timestamps back to real time.
refactor: combine video-related logic
Centralize transcript JSON decoding and transcript→observation conversion so Gemini and Doubao use the same validation/expansion logic.

Stack created with Sapling. Best reviewed with ReviewStack.

Copilot

Pull request overview

Adds Volcengine Ark / Doubao as a new selectable LLM provider and refactors shared video/transcript + prompt logic so Gemini and Doubao can share validation and prompt templates.

Changes:

Adds Doubao (Ark) provider selection + onboarding/setup flow and settings wiring, including a provider-specific connection test.
Introduces shared utilities for video timestamp parsing, transcript JSON decoding, and timeline validation; centralizes prompt templates and prompt override persistence.
Adds DoubaoArkProvider and integrates it into LLMService routing and analytics labeling.

Reviewed changes

Copilot reviewed 17 out of 17 changed files in this pull request and generated 7 comments.

Show a summary per file

File	Description
Dayflow/Dayflow/Views/UI/Settings/SettingsProvidersTabView.swift	Wires provider-specific connection test view for Gemini/Doubao.
Dayflow/Dayflow/Views/UI/Settings/ProvidersSettingsViewModel.swift	Adds Doubao provider configuration + metadata; switches prompt overrides to shared video prompt preferences.
Dayflow/Dayflow/Views/UI/RetryCoordinator.swift	Updates progress labels to “1/2”, “2/2”.
Dayflow/Dayflow/Views/Onboarding/TestConnectionView.swift	Generalizes connection testing across supported providers (Gemini + Doubao).
Dayflow/Dayflow/Views/Onboarding/OnboardingLLMSelectionView.swift	Adds Doubao provider card and makes card layout dynamic.
Dayflow/Dayflow/Views/Onboarding/LLMProviderSetupView.swift	Adds Doubao onboarding steps + base URL/model id inputs and persistence.
Dayflow/Dayflow/System/AnalyticsService.swift	Adds local logging alongside PostHog capture.
Dayflow/Dayflow/Core/Analysis/TimeParsing.swift	Adds shared LLM timestamp, transcript decode, and timeline validation utilities.
Dayflow/Dayflow/Core/AI/PromptPreferences.swift	New shared prompt overrides + templates for video providers.
Dayflow/Dayflow/Core/AI/LLMTypes.swift	Adds `.doubaoArk` provider type/id and Doubao preferences keys/defaults.
Dayflow/Dayflow/Core/AI/LLMService.swift	Instantiates and routes to `DoubaoArkProvider` for batch + text operations.
Dayflow/Dayflow/Core/AI/GeminiPromptPreferences.swift	Updates prompt example formatting (colon after time range).
Dayflow/Dayflow/Core/AI/GeminiDirectProvider.swift	Uses shared prompt templates and shared transcript/timeline validation utilities.
Dayflow/Dayflow/Core/AI/DoubaoArkProvider.swift	Implements Ark OpenAI-compatible chat-completions provider with video_url-based transcription and card generation.
Dayflow/Dayflow/Core/AI/ChatCLIPromptPreferences.swift	Updates prompt example formatting (colon after time range).
Dayflow/Dayflow/App/AppDelegate.swift	Adds Doubao provider label to analytics event payload.
Dayflow/Dayflow.xcodeproj/project.pbxproj	Updates code signing team setting.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-03-04T22:31:31Z

+        if let apiKey = KeychainManager.shared.retrieve(for: "doubao"), !apiKey.isEmpty {
+            return DoubaoArkProvider(apiKey: apiKey, endpoint: endpoint, modelId: modelId)


makeDoubaoProvider only checks !apiKey.isEmpty without trimming whitespace/newlines. Elsewhere (setup + settings) keys are trimmed before validation. Consider trimming here too so a whitespace-only key doesn’t create a provider that will fail every request.

Suggested change

if let apiKey = KeychainManager.shared.retrieve(for: "doubao"), !apiKey.isEmpty {

return DoubaoArkProvider(apiKey: apiKey, endpoint: endpoint, modelId: modelId)

if let rawApiKey = KeychainManager.shared.retrieve(for: "doubao") {

let apiKey = rawApiKey.trimmingCharacters(in: .whitespacesAndNewlines)

if !apiKey.isEmpty {

return DoubaoArkProvider(apiKey: apiKey, endpoint: endpoint, modelId: modelId)

}

Copilot · 2026-03-04T22:31:31Z

+enum GeminiPromptDefaults {
+    static let titleBlock = """
+Titles
+Each title is a memory trigger. Be specific enough that it could only describe one situation.
+"Bug fixes" could be anything. "Fixed the infinite scroll crash on search results" can only be one thing.
+"Gaming session" could be any day. "League ARAM — Thresh and Jinx" is a specific session.
+
+Use honest verbs
+The verb matters. Pick the one that describes what actually happened, not the one that sounds most professional.
+If someone was browsing a product page and picking options, they were "speccing out" a purchase — not "configuring" it (that implies they already own it). If someone scheduled a meeting, they "scheduled" it — not "coordinated" it. If someone was scrolling a feed, they were "scrolling" — not "catching up on industry news."
+The wrong verb changes the memory. Get it right even if it sounds less impressive.
+
+Accuracy over polish
+Don't compress what happened into a technical-sounding phrase that loses the meaning. If the actual bug was "the notification wasn't showing up after regeneration," say that — don't abstract it into "verification pipeline error" because it sounds more engineered.
+The title's job is to be TRUE and SPECIFIC, not to sound smart. When in doubt, describe the actual problem or action in plain language.
+
+Titles can be longer
+A title that's a few words longer but triggers a real memory beats a short vague one every time. Don't trim useful detail for brevity. Aim for roughly 5–15 words — but if word 12 is the one that makes you remember, keep it.
+
+Banned words
+These are corporate filler. No human writes them in a journal:
+"research," "coordination," "management," "administration," "workflow," "sync," "alignment," "exploration," "investigation," "project development," "social chat," "various," "multiple," "several," "deep dive," "rabbit hole"
+Don't just avoid these exact words — avoid the energy. "Analyzing" is just "research" in a lab coat. "Refining" is just "working on" trying to sound important. "Coordinated" is "scheduled" wearing a tie. If you wouldn't say it out loud to a friend, it's too formal.
+


GeminiPromptDefaults is declared here but also already exists in Core/AI/GeminiPromptPreferences.swift, which will cause a duplicate symbol / ambiguous type compile error when using file-system–synchronized groups (both files are built). Consider moving the defaults to a single source of truth (e.g., keep GeminiPromptDefaults only in one file) and updating references accordingly, or deleting/renaming the older definitions if this file replaces them.

Suggested change

enum GeminiPromptDefaults {

static let titleBlock = """

Titles

Each title is a memory trigger. Be specific enough that it could only describe one situation.

"Bug fixes" could be anything. "Fixed the infinite scroll crash on search results" can only be one thing.

"Gaming session" could be any day. "League ARAM — Thresh and Jinx" is a specific session.

Use honest verbs

The verb matters. Pick the one that describes what actually happened, not the one that sounds most professional.

If someone was browsing a product page and picking options, they were "speccing out" a purchase — not "configuring" it (that implies they already own it). If someone scheduled a meeting, they "scheduled" it — not "coordinated" it. If someone was scrolling a feed, they were "scrolling" — not "catching up on industry news."

The wrong verb changes the memory. Get it right even if it sounds less impressive.

Accuracy over polish

Don't compress what happened into a technical-sounding phrase that loses the meaning. If the actual bug was "the notification wasn't showing up after regeneration," say that — don't abstract it into "verification pipeline error" because it sounds more engineered.

The title's job is to be TRUE and SPECIFIC, not to sound smart. When in doubt, describe the actual problem or action in plain language.

Titles can be longer

A title that's a few words longer but triggers a real memory beats a short vague one every time. Don't trim useful detail for brevity. Aim for roughly 5–15 words — but if word 12 is the one that makes you remember, keep it.

Banned words

These are corporate filler. No human writes them in a journal:

"research," "coordination," "management," "administration," "workflow," "sync," "alignment," "exploration," "investigation," "project development," "social chat," "various," "multiple," "several," "deep dive," "rabbit hole"

Don't just avoid these exact words — avoid the energy. "Analyzing" is just "research" in a lab coat. "Refining" is just "working on" trying to sound important. "Coordinated" is "scheduled" wearing a tie. If you wouldn't say it out loud to a friend, it's too formal.

// NOTE: GeminiPromptDefaults is defined in Core/AI/GeminiPromptPreferences.swift

// to provide a single source of truth for these defaults and avoid duplicate symbols.

Copilot · 2026-03-04T22:31:31Z

 Bad example:
 "7:00 AM - 7:30 AM writing Notion doc
-7:30 AM - 7:35 AM Slack
+7:30 AM - 7:35: AM Slack


The example timestamp format has a typo: 7:30 AM - 7:35: AM Slack includes an extra colon after 7:35, which breaks the intended [H:MM AM/PM] - [H:MM AM/PM]: ... pattern.

Suggested change

7:30 AM - 7:35: AM Slack

7:30 AM - 7:35 AM Slack

Copilot · 2026-03-04T22:31:32Z

 				CURRENT_PROJECT_VERSION = 95;
 				DEVELOPMENT_ASSET_PATHS = "\"Dayflow/Preview Content\"";
-				DEVELOPMENT_TEAM = L75WYD8X4Y;
+				DEVELOPMENT_TEAM = GWKJRP4KM9;


This change updates DEVELOPMENT_TEAM in the project file. That setting is usually machine/account-specific and will break code signing for other contributors/CI unless intentionally coordinated. Consider reverting this change or moving it to a separate, explicitly-scoped signing/CI PR.

Suggested change

DEVELOPMENT_TEAM = GWKJRP4KM9;

Copilot · 2026-03-04T22:31:32Z

 /// Cached DateFormatters for time parsing - creating DateFormatters is expensive (ICU initialization)
 private let cachedHMMAFormatters: [DateFormatter] = {
-    let formats = [
-        "h:mma",   // 09:30AM, 9:30AM
-        "hh:mma",  // 09:30AM
-        "h:mm a",  // 09:30 AM, 9:30 AM
-        "hh:mm a"  // 09:30 AM
-    ]
-    return formats.map { format in
-        let formatter = DateFormatter()
-        formatter.locale = Locale(identifier: "en_US_POSIX") // Essential for AM/PM parsing
-        formatter.dateFormat = format
-        return formatter
-    }
+  let formats = [
+    "h:mma",  // 09:30AM, 9:30AM
+    "hh:mma",  // 09:30AM
+    "h:mm a",  // 09:30 AM, 9:30 AM
+    "hh:mm a",  // 09:30 AM
+  ]
+  return formats.map { format in
+    let formatter = DateFormatter()
+    formatter.locale = Locale(identifier: "en_US_POSIX")  // Essential for AM/PM parsing
+    formatter.dateFormat = format
+    return formatter
+  }


cachedHMMAFormatters stores shared DateFormatter instances, but DateFormatter is not thread-safe and parseTimeHMMA is called from nonisolated/background contexts (e.g., timeline validation during LLM work). This can lead to data races/crashes. Consider either creating a fresh formatter per call, or protecting shared formatters behind a lock/queue, or switching to a thread-safe parsing strategy.

Copilot · 2026-03-04T22:31:32Z

+/// Shared utilities for decoding model-generated transcript JSON and converting it into Dayflow observations.
+///
+/// Providers differ in transport (Gemini schema-constrained vs Ark chat-completions), but the transcript payload
+/// is intentionally normalized to the same `[{ startTimestamp, endTimestamp, description }]` array.
+enum LLMTranscriptUtilities {
+  struct VideoTranscriptChunk: Codable {
+    let startTimestamp: String
+    let endTimestamp: String
+    let description: String


New transcript/timeline utilities are introduced here, but the repo already has unit tests for TimeParsing (TimeParsingTests.swift). Consider adding targeted tests for the new JSON decoding + bracket-fallback extraction and timestamp expansion/filtering behavior to prevent regressions.

Copilot · 2026-03-04T22:31:33Z

+        let json = jsonString(properties)
+        let line = truncate("[Analytics] \(event) \(json)")
+        print(line)
+        localLogger.info("\(line, privacy: .public)")


logLocal prints every analytics event to stdout and logs it with privacy: .public. This can be very noisy in production and risks leaking sensitive/PII if it slips past sanitize. Consider gating local logging behind #if DEBUG (or a dedicated runtime flag) and using .private (or omitting properties) for OSLog privacy.

Suggested change

let json = jsonString(properties)

let line = truncate("[Analytics] \(event) \(json)")

print(line)

localLogger.info("\(line, privacy: .public)")

#if DEBUG

let json = jsonString(properties)

let line = truncate("[Analytics] \(event) \(json)")

print(line)

localLogger.info("\(line, privacy: .private)")

#else

_ = event

_ = properties

#endif

JerryZLiu · 2026-03-04T23:41:47Z

unfortunately i don't think it makes sense to merge this for now - my current bar is that anything merged should be useful to at least 10% of users and i don't think doubao is very popular right now. could change in the future though! obviously feel free to use it in your own fork though!

Ensures every AnalyticsService event is sent to PostHog, printed to stdout, and emitted via Apple Unified Logging.

Adds Volcengine Ark/Doubao as a selectable provider with onboarding + settings wiring and connection testing. * use Ark video understanding for screenshot transcription * Send a single timelapse MP4 via Ark chat-completions `video_url` instead of attaching per-frame `image_url`, and expand video timestamps back to real time. * refactor: combine video-related logic * Centralize transcript JSON decoding and transcript→observation conversion so Gemini and Doubao use the same validation/expansion logic.

vegerot · 2026-03-05T02:32:30Z

unfortunately i don't think it makes sense to merge this for now - my current bar is that anything merged should be useful to at least 10% of users and i don't think doubao is very popular right now. could change in the future though! obviously feel free to use it in your own fork though!

@JerryZLiu Makes total sense! I split this patch into 2:

refactor: centralize LLM prompt templates and transcript utilities #226 - making it easier to add video-understanding LLMs in the future
Add Doubao support, on my fork

Could we please land #226? I think it's an improvement for this project as well as making my fork easier to maintain.

vegerot mentioned this pull request Feb 17, 2026

feat: improve date format #208

Merged

vegerot force-pushed the pr209 branch 3 times, most recently from d24845a to af0c6fc Compare February 20, 2026 21:41

vegerot mentioned this pull request Feb 20, 2026

feat: add Doubao (Ark) LLM provider #212

Open

vegerot force-pushed the pr209 branch 2 times, most recently from 0d96ce6 to aa1f7f8 Compare February 24, 2026 19:05

vegerot force-pushed the pr209 branch from aa1f7f8 to f0a0d0e Compare March 4, 2026 22:24

Copilot AI review requested due to automatic review settings March 4, 2026 22:24

This was referenced Mar 4, 2026

feat: mirror analytics events to stdout and OSLog #225

Closed

better progress message on cards #224

Merged

Copilot started reviewing on behalf of vegerot March 4, 2026 22:24 View session

Copilot AI reviewed Mar 4, 2026

View reviewed changes

feat: mirror analytics events to stdout and OSLog

16e3cdf

Ensures every AnalyticsService event is sent to PostHog, printed to stdout, and emitted via Apple Unified Logging.

vegerot force-pushed the pr209 branch from f0a0d0e to 1714d32 Compare March 5, 2026 00:34

vegerot force-pushed the pr209 branch from 1714d32 to 9300a85 Compare March 5, 2026 00:35

vegerot closed this Mar 9, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: add Doubao (Ark) LLM provider#209

feat: add Doubao (Ark) LLM provider#209
vegerot wants to merge 2 commits intoJerryZLiu:mainfrom
vegerot:pr209

vegerot commented Feb 17, 2026 •

edited

Loading

Uh oh!

Copilot AI left a comment

Uh oh!

Copilot AI Mar 4, 2026

Uh oh!

Copilot AI Mar 4, 2026

Uh oh!

Copilot AI Mar 4, 2026

Uh oh!

Copilot AI Mar 4, 2026

Uh oh!

Copilot AI Mar 4, 2026

Uh oh!

Copilot AI Mar 4, 2026

Uh oh!

Copilot AI Mar 4, 2026

Uh oh!

JerryZLiu commented Mar 4, 2026

Uh oh!

vegerot commented Mar 5, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

		if let apiKey = KeychainManager.shared.retrieve(for: "doubao"), !apiKey.isEmpty {
		return DoubaoArkProvider(apiKey: apiKey, endpoint: endpoint, modelId: modelId)

Conversation

vegerot commented Feb 17, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Mar 4, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 4, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 4, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 4, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 4, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 4, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 4, 2026

Choose a reason for hiding this comment

Uh oh!

JerryZLiu commented Mar 4, 2026

Uh oh!

vegerot commented Mar 5, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

vegerot commented Feb 17, 2026 •

edited

Loading