Hello,
I've been very familiar with the UVC Support in iPadOS ever since it launched in iOS 17. There are a number of people that use the software I've developed built around UVC and there are often queries about 8-Bit vs. 10-Bit.
My understanding is that the newest UVC Spec is 1.5 which was standardised in 2012 and almost every UVC Capture Card runs at 8-Bit.
The only 10-Bit Capture Card that is on my radar is the AJA U-Tap SDI, however it looks like this is 10-Bit up until the UVC Part where the 10-Bit Input is downsampled to 8-Bit. Though I have read in certain places that it works as a 10-Bit Capture Card on macOS but not on iPadOS.
I was just wondering if 10-Bit via UVC is even possible on iPadOS? If there was indeed a true 10-Bit Source being passed into an iPad, would iPadOS allow it or would it be downsampled by AVFoundation so it can show up as a valid external video input?
All USB Capture Cards that I have encountered use one of the following formats:
kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange
kCVPixelFormatType_420YpCbCr8BiPlanarFullRange
kCVPixelFormatType_32BGRA
So if a UVC Device delivered a 10-Bit Format, would that be accessible by iPadOS or would it fallback to these 8-Bit Formats by default?
Thanks!
AVFoundation
RSS for tagWork with audiovisual assets, control device cameras, process audio, and configure system audio interactions using AVFoundation.
Posts under AVFoundation tag
200 Posts
Selecting any option will automatically load the page
Post
Replies
Boosts
Views
Activity
Hello everyone,
I’m looking for more detailed information regarding UVC (USB Video Class) over MFi within the Apple ecosystem and would appreciate some clarification.
I’m interested in developing (or interfacing with) an accessory that transmits video over USB using the UVC standard, and I’d like to better understand how this works within the MFi (Made for iPhone) program.
Here are my main questions:
1. Do iOS devices provide native support for UVC over USB-C or Lightning within the MFi framework?
2. Are there any specific firmware or authentication requirements when the accessory is MFi-certified?
3. Does UVC support depend solely on the hardware interface (USB-C vs Lightning), or are there additional software-level requirements?
4. Is there any official documentation outlining the recommended flow for implementing UVC-based video capture accessories on iOS?
From what I understand, USB-C iPads appear to offer more direct support for standard UVC devices, but it’s not entirely clear how this integrates with the MFi ecosystem with iOS, especially for commercial product development.
If anyone has gone through this process or can point me to relevant technical documentation, I would greatly appreciate the guidance.
Thank you!
Hi,
I’m trying to better understand how AVAssetDownloadConfiguration selects video variants when downloading HLS content for offline playback.
Suppose I have an HLS master playlist (.m3u8) that contains several video variants defined with #EXT-X-STREAM-INF.
For example, the master playlist may contain multiple video streams like this:
Same resolution, different BANDWIDTH
Or different resolutions (for example 720p, 1080p, etc.)
My question is:
How many video variants are actually downloaded when using AVAssetDownloadConfiguration without specifying any variantQualifiers?
In other words:
If the master playlist contains multiple video variants, will the download task fetch only one variant, or multiple variants?
Does the behavior differ depending on whether the variants differ only by BANDWIDTH or also by RESOLUTION?
What I observed in testing
In my tests, I always end up with only one video variant downloaded, specifically the one with the highest BANDWIDTH parameter. In the m3u8 files I tested, all video variants had identical parameters (resolution, codec, frame rate, etc.) and differed only by the BANDWIDTH attribute in the master playlist.
However, when inspecting the downloaded .movpkg, I noticed something interesting in boot.xml.
It lists two video streams:
one with complete="true" (the one with highest bandwidth)
another with complete="no" (the one with lowest bandwidth)
I actually had 3 video streams listed in m3u8, but the one with middle bandwidth wasn't listed in boot.xml file at all.
There are also additional streams for audio and subtitles in boot.xml file.
This made me wonder whether the system initially attempts to download another video variant (possibly a lower bitrate one), but then switches to the highest-quality variant and only completes that one.
Additional question about variantQualifiers
If I provide a predicate such as:
NSPredicate(format: "peakBitRate > 0")
which should theoretically match all variants, will the download task attempt to download all matching video variants, or will it still select only one?
Summary
So the main questions are:
Without variantQualifiers, does AVAssetDownloadConfiguration always download a single video variant, and if so, how is it chosen?
Does the behavior differ if variants have different resolutions vs only different bitrates?
When a predicate matches multiple variants, can multiple video variants actually be downloaded in a single .movpkg?
Why might boot.xml list multiple video streams when only one appears to be fully downloaded?
Any clarification on the intended behavior would be greatly appreciated.
Thanks!
In Instruments, I'm seeing "Zero Time Stamp" events in the "Audio Server" lane.
What does that mean?
Hello,
I’m experiencing a severe performance degradation when running CoreML models on a live AVFoundation video feed compared to offline or synthetic inference. This happens across multiple models I've converted (including SCI, RTMPose, and RTMW) and affects multiple devices.
The Environment
OS: macOS 26.3, iOS 26.3, iPadOS 26.3
Hardware: Mac14,6 (M2 Max), iPad Pro 11 M1, iPhone 13 mini
Compute Units: cpuAndNeuralEngine
The Numbers
When testing my SCI_output_image_int8.mlpackage model, the inference timings are drastically different:
Synthetic/Offline Inference: ~1.34 ms
Live Camera Inference: ~15.96 ms
Preprocessing is completely ruled out as the bottleneck. My profiling shows total preprocessing (nearest-neighbor resize + feature provider creation) takes only ~0.4 ms in camera mode. Furthermore, no frames are being dropped.
What I've Tried
I am building a latency-critical app and have implemented almost every recommended optimization to try and fix this, but the camera-feed penalty remains:
Matched the AVFoundation camera output format exactly to the model input (640x480 at 30/60fps).
Used IOSurface-backed pixel buffers for everything (camera output, synthetic buffer, and resize buffer).
Enabled outputBackings.
Loaded the model once and reused it for all predictions.
Configured MLModelConfiguration with reshapeFrequency = .frequent and specializationStrategy = .fastPrediction.
Wrapped inference in ProcessInfo.processInfo.beginActivity(options: .latencyCritical, reason: "CoreML_Inference").
Set DispatchQueue to qos: .userInteractive.
Disabled the idle timer and enabled iOS Game Mode.
Exported models using coremltools 9.0 (deployment target iOS 26) with ImageType inputs/outputs and INT8 quantization.
Reproduction
To completely rule out UI or rendering overhead, I wrote a standalone Swift CLI script that isolates the AVFoundation and CoreML pipeline. The script clearly demonstrates the ~15ms latency on live camera frames versus the ~1ms latency on synthetic buffers.
(I have attached camera_coreml_benchmark.swift and coreml model (very light low light enghancement model) to this repo on github https://github.com/pzoltowski/apple-coreml-camera-latency-repro).
My Question:
Is this massive overhead expected behavior for AVFoundation + Core ML on live feeds, or is this a framework/runtime bug? If expected, what is the Apple-recommended pattern to bypass this camera-only inference slowdown?
One think found interesting when running in debug model was faster (not as fast as in performance benchmark but faster than 16ms. Also somehow if I did some dummy calculation on on different DispatchQueue also seems like model got slightly faster. So maybe its related to ANE Power State issues (Jitter/SoC Wake) and going to fast to sleep and taking a long time to wakeup? Doing dummy calculation in background thought is probably not a solution.
Thanks in advance for any insights!
In iOS 26, AVSpeechSynthesizer read Mandarin into Cantonese pronunciation.
No matter how you set the language, and change the settings of my phone system, it doesn't work.
let utterance = AVSpeechUtterance(string: "你好啊")
//let voice = AVSpeechSynthesisVoice(language: "zh-CN") // not work
let voice = AVSpeechSynthesisVoice(language: "zh-Hans") // not work too
utterance.voice = voice
et synth = AVSpeechSynthesizer()
synth.speak(utterance)
Topic:
Media Technologies
SubTopic:
General
Tags:
Speech
Internationalization
Localization
AVFoundation
Please include the line below in follow-up emails for this request.
Case-ID: 11089799
When using AVSpeechUtterance and setting it to play in Mandarin, if Siri is set to Cantonese on iOS 18, it will be played in Cantonese. There is no such issue on iOS 17 and 16.
1.let utterance = AVSpeechUtterance(string: textView.text)
let voice = AVSpeechSynthesisVoice(language: "zh-CN")
utterance.voice = voice
2.In the phone settings, Siri is set to Cantonese
My code that streams buffers into AVAudioPlayerNode is stuttering when the buffer is finished and before the next one is played.
while engine.isRunning {
let framesToCopy = min(buffer.frameLength - framePosition, Self.BufferSize)
let srcRaw = UnsafeRawPointer(srcPtr)
let playbackBuffer = AVAudioPCMBuffer(pcmFormat: buffer.format, frameCapacity: Self.BufferSize)!
let playbackPtr = playbackBuffer.floatChannelData![0]
let destRaw = UnsafeMutableRawPointer(mutating: playbackPtr)
memcpy(destRaw, srcRaw, Int(framesToCopy) * MemoryLayout<Float>.stride)
srcPtr = srcPtr.advanced(by: Int(framesToCopy))
playbackBuffer.frameLength = framesToCopy
await player.scheduleBuffer(playbackBuffer,
at: nil,
options: [],
completionCallbackType: .dataRendered)
}
I've tried to schedule multiple buffers at once using a combination of both the synchronous and async versions of scheduleBuffer because I thought the delay might be but it still stutters and the data copied into the playbackBuffer matches the source buffer. I've tried all combinations of options and completionCallbackType but no luck.
I've tried increasing the buffer size but that just spaces out the stutters because the buffer is larger.
What am I missing about this API?
Hi everyone,
We’re encountering an issue where AudioQueueNewOutput blocks indefinitely and never returns, and we’re hoping to get some insight or confirmation if this is a known behavior/regression on newer iOS versions.
Issue Description
When triggering audio playback, we create an output AudioQueue using AudioQueueNewOutput.
On some devices, the call hangs inside AudioQueueNewOutput and never returns, with no OSStatus error and no subsequent logs.
This behavior is reproducible mainly on iOS 18.3.
Earlier iOS versions do not show this issue under the same code path.
if (audioDes)
{
mAudioDes.mSampleRate = audioDes->mSampleRate;
mAudioDes.mBitsPerChannel = audioDes->mBitsPerChannel;
mAudioDes.mChannelsPerFrame = audioDes->mChannelsPerFrame;
mAudioDes.mFormatID = audioDes->mFormatID;
mAudioDes.mFormatFlags = audioDes->mFormatFlags;
mAudioDes.mFramesPerPacket = audioDes->mFramesPerPacket;
mAudioDes.mBytesPerFrame = audioDes->mBytesPerFrame;
mAudioDes.mBytesPerPacket = audioDes->mBytesPerFrame;
mAudioDes.mReserved = 0;
}
// Create AudioQueue for output
OSStatus status = AudioQueueNewOutput(
&mAudioDes,
AQOutputCallback,
this,
NULL,
NULL,
0,
&audioQueue
);
code-block
The thread blocks inside AudioQueueNewOutput, and execution never reaches the next line.
Additional Notes / Observations
ASBD is confirmed to be valid
Standard PCM output
Sample rate, channels, bytes per frame/packet all consistent
Same ASBD works correctly on earlier iOS versions
AudioQueue is created on a background thread
Not on the main thread
Not inside the AudioQueue callback
On first creation, AVAudioSession may not yet be active
setCategory and setActive:YES may be called shortly before creating the AudioQueue
There may be a timing window where the session is still activating
Issue is reported mainly on iOS 18.3
Multiple user reports point to iOS 18.3 devices
Same code path works on iOS 17.x and earlier
No OSStatus error is returned — the call simply never returns.
Questions
Is it expected that AudioQueueNewOutput can block indefinitely while waiting for AVAudioSession / audio route / HAL readiness?
Have there been any behavior changes in iOS 18.3 regarding AudioQueue creation or AudioSession synchronization?
Is it unsafe to call AudioQueueNewOutput before AVAudioSession is fully active on recent iOS versions?
Are there recommended patterns (or delays / callbacks) to ensure AudioQueue creation does not hang?
Any insight or confirmation would be greatly appreciated.
Thanks in advance!
I read somewhere that the frames are returned in decode order instead of presentation order when using AVAssetReader. The documentation seems sparse on the subject. I have so far failed to find a video file where the frames are not returned in presentation order.
Can anyone confirm the frames are actually returned in decode order?
Hello,
I am currently developing a live streaming application using AVPlayer to play LL-HLS (Low-Latency HLS) content.
During our testing phase, we consistently encountered the following error in the logs:
CoreMediaErrorDomain Code=-15517
The challenge we are facing is that the error description is quite vague. It only provides cryptic messages such as "Key not found" or "No value information," which makes it extremely difficult to identify the root cause or perform a deep-dive analysis.
I have searched through the official Apple Developer documentation and technical notes, but I couldn’t find any specific reference to what Code -15517 signifies in the context of LL-HLS or CoreMedia.
Regarding this issue, I have the following questions:
What is the specific meaning of this error code (-15517)? Does it relate to missing tags in the HLS manifest, or is it an internal state issue within the AVPlayer stack?
Specifically, I would like to know if this is a critical error that disrupts playback, or if it is just a warning that can be safely ignored.
Is there any additional logging or debugging tool you would recommend to further investigate "Key not found" issues in LL-HLS?
Any insights or guidance from the community or Apple engineers would be greatly appreciated.
Thank you in advance for your help.
I just bought a monitor S2725QC from Dell Technologies and isn't fully-integrated with MacOS even though it says on the website it is compatible with MacOS.
https://www.dell.com/en-us/shop/all-monitors/sac/monitors/all-monitors/macos-compatible?appliedRefinements=51765
The screen brightness and volume control buttons don't work with the monitors (I have two). What can I do in terms of writing code with Dell Monitor SDK and MacOS Frameworks/Technologies?
I want to:
Run ARKit on the main rear camera, and while it's running shoot high resolution pictures on the wide camera, without disturbing the AR tracking.
Is this possible?
I'm relatively new to Swift development (and native iOS development for that matter)
I've got an iOS app that uses the iPhone / iPad built in cameras, and am looking to make this more compatible with macOS.
Using the normal AVCaptureDevice.DiscoverySession I seem to get the iPhone Continuity Camera and the in-built MacBook Pro camera but I don't see other input devices that I see in QuickTime Player (for example) such as connected external cameras or Virtual Inputs provided by NDI Virtual Input and OBS.
Is there a way to see access these without a specific Mac build (as the rest of the functionality works great, and I'd rather not diverge the codebase too much as it's easier to update one app than two!
I have a question regarding the behavior of AVAudioSession.sharedInstance().outputVolume.
Observed behavior:
When the app is in the foreground, I read audioSession.outputVolume (for example, 0.1).
The app is then moved to the background.
While the app is in the background, the user changes the system volume using the hardware buttons (for example, to 0.5).
When the app returns to the foreground, audioSession.outputVolume still reports the previous value (0.1).
From my testing, outputVolume only seems to update when the system volume is changed while the app is in the foreground. Volume changes made while the app is in the background are not reflected when the app returns to the foreground.
Questions:
According to Apple’s documentation for AVAudioSession.outputVolume:
“The systemwide output volume set by the user.”
https://developer.apple.com/documentation/avfaudio/avaudiosession/outputvolume
However, based on our testing on iOS 18.6.2 and iOS 18.1, the observed behavior seems to differ from this description.
Questions:
The documentation states that outputVolume represents the system-wide volume set by the user. In our testing, the value does not reflect volume changes made while the app is in the background and only updates when the app is in the foreground.Is this the expected behavior of AVAudioSession.outputVolume?
Is there any other recommended way in Swift to retrieve the current system volume that reflects user changes made both while the app is in the foreground and while it is in the background?
Any clarification on the intended behavior or recommended handling would be greatly appreciated.
I’m building a teleprompter-style app that relies on Picture in Picture.
PiP starts correctly on device.
Everything works — until another app (e.g. TikTok / Instagram) starts active video recording.
When camera capture begins in the foreground app, iOS terminates my PiP session.
Some teleprompter apps appear to keep PiP active while recording in other apps, so I’m trying to understand the recommended architectural pattern for this scenario.
Is there a documented approach or best practice to keep PiP stable during third-party camera capture?
Looking specifically for guidance on the correct AVKit / AVAudioSession configuration for this use case.
According to the documentation (https://developer.apple.com/documentation/avfoundation/avcontentkeyrequest/originatingrecipient?changes=_3&language=objc), starting with ios 18.4, I can get AVContentKeyRecipient from AVContentKeyRequest. But when I try to get it, I get a crash. What could be the issue?
I want to note that I add the asset to the AVContentKeySession using the addContentKeyRecipient method (https://developer.apple.com/documentation/avfoundation/avcontentkeysession/addcontentkeyrecipient(_:)?changes=_3&language=objc).
VideoMaterial Black Screen on Vision Pro Device (Works in Simulator)
App Overview
App Name: Extn Browser
Bundle ID: ai.extn.browser
Purpose: A visionOS web browser that plays 360°/180° VR videos in an immersive sphere environment
Development Environment & SDK Versions
Component
Version
Xcode
26.2
Swift
6.2
visionOS Deployment Target
26.2
Swift Concurrency
MainActor isolation enabled
App is released in the TestFlight.
Frameworks Used
SwiftUI - UI framework
RealityKit - 3D rendering, MeshResource, ModelEntity, VideoMaterial
AVFoundation - AVPlayer, AVAudioSession
WebKit - WKWebView for browser functionality
Network - NWListener for local proxy server
Sphere Video Mechanism
The app creates an immersive 360° video experience using the following approach:
// 1. Create sphere mesh (10 meter radius for immersive viewing)
let mesh = MeshResource.generateSphere(radius: 10.0)
// 2. Create initial transparent material
var material = UnlitMaterial()
material.color = .init(tint: .clear)
// 3. Create entity and invert sphere (negative X scale)
let sphere = ModelEntity(mesh: mesh, materials: [material])
sphere.scale = SIMD3<Float>(-1, 1, 1) // Inverts normals for inside-out viewing
sphere.position = SIMD3<Float>(0, 1.5, 0) // Eye level
// 4. Create AVPlayer with video URL
let player = AVPlayer(url: videoURL)
// 5. Configure audio session for visionOS
let audioSession = AVAudioSession.sharedInstance()
try audioSession.setCategory(.playback, mode: .moviePlayback, options: [.mixWithOthers])
try audioSession.setActive(true)
// 6. Create VideoMaterial and apply to sphere
let videoMaterial = VideoMaterial(avPlayer: player)
if var modelComponent = sphere.components[ModelComponent.self] {
modelComponent.materials = [videoMaterial]
sphere.components.set(modelComponent)
}
// 7. Start playback
player.play()
ImmersiveSpace Configuration
// browserApp.swift
ImmersiveSpace(id: appModel.immersiveSpaceID) {
ImmersiveView()
.environment(appModel)
}
.immersionStyle(selection: .constant(.mixed), in: .mixed)
Entitlements
<!-- browser.entitlements -->
<key>com.apple.security.app-sandbox</key>
<true/>
<key>com.apple.security.network.client</key>
<true/>
<key>com.apple.security.network.server</key>
<true/>
Info.plist Network Configuration
<key>NSAppTransportSecurity</key>
<dict>
<key>NSAllowsArbitraryLoads</key>
<true/>
</dict>
The Issue
Behavior in Simulator: Video plays correctly on the inverted sphere surface - 360° video is visible and wraps around the user as expected.
Behavior on Physical Vision Pro: The sphere displays a black screen. No video content is visible, though the sphere entity itself is present.
Important: Not a DRM/Licensing Issue
This issue is NOT related to Digital Rights Management (DRM) or FairPlay. I have tested with:
Unlicensed raw MP4 video files (no DRM protection)
Self-hosted video content with no copy protection
Direct MP4 URLs from CDN without any licensing requirements
The same black screen behavior occurs with all unprotected video sources, ruling out DRM as the cause.
(Plain H.264 MP4, no DRM)
Screen Recording: Working in Simulator
The following screen recording demonstrates playing a 360° YouTube video in the immersive sphere on the visionOS Simulator:
https://cdn.commenda.kr/screen-001.mov
This confirms that the VideoMaterial and sphere rendering work correctly in the simulator, but the same setup shows a black screen on the physical Vision Pro device.
Observations
AVPlayer status reports .readyToPlay - The video appears to load successfully
VideoMaterial is created without errors - No exceptions thrown
Sphere entity renders - The geometry is visible (black surface)
Audio session is configured - No errors during audio session setup
Network requests succeed - The video URL is accessible from the device
Same result with local/unprotected content - DRM is not a factor
Console Logs (Device)
The logging shows:
Sphere created and added to scene
AVPlayer created with correct URL
VideoMaterial created and applied
Player status transitions to .readyToPlay
player.play() called successfully
Rate shows 1.0 (playing)
Despite all success indicators, the rendered output is black.
Questions for Apple
Are there known differences in VideoMaterial behavior between the visionOS Simulator and physical Vision Pro hardware?
Does VideoMaterial(avPlayer:) require specific video codec/format requirements that differ on device? (The test video is a standard H.264 MP4)
Is there a required Metal capability or GPU feature for VideoMaterial that may not be available in certain contexts on device?
Does the immersion style (.mixed) affect VideoMaterial rendering on hardware?
Are there additional entitlements required for video texture rendering in RealityKit on physical hardware?
Attempted Solutions
Configured AVAudioSession with .playback category
Added delay before player.play() to ensure material is applied
Verified sphere scale inversion (-1, 1, 1)
Tested multiple video URLs (including raw, unlicensed MP4 files)
Confirmed network connectivity on device
Ruled out DRM/FairPlay issues by testing unprotected content
Environment Details
Device: Apple Vision Pro
visionOS Version: 26.2
Xcode Version: 26.2
macOS Version: Darwin 25.2.0
Dear Support Team,
I am writing to seek technical assistance regarding a persistent issue with Dolby Vision exporting in DaVinci Resolve 20 on my iPad Pro 12.9-inch (2021, M1 chip) running iPadOS 26.0.1.
The Issue:
Despite correctly configuring the project for a Dolby Vision workflow and successfully completing the dynamic metadata analysis, the "Dolby Vision Profile" dropdown menu (and related embedding options) is completely missing from the Advanced Settings in the Deliver page.
My Current Configuration & Steps Taken:
Software Version: DaVinci Resolve Studio 20 (Studio features like Dolby Vision analysis are active and functional).
Project Settings: Color Science: DaVinci YRGB Color Managed.
Dolby Vision: Enabled (Version 4.0) with Mastering Display set to 1000 nits.
Output Color Space: Rec.2100 ST2084.
Color Page: Dynamic metadata analysis has been performed, and "Trim" controls are functional.
Export Settings:
Format: QuickTime / MP4.
Codec: H.265 (HEVC).
Encoding Profile: Main 10.
The Problem: Under "Advanced Settings," there is no option to select a Dolby Vision Profile (e.g., Profile 8.4) or to "Embed Dolby Vision Metadata."
Potential Variables:
System Version: I am currently running iPadOS 26.
Apple ID: My iPad is currently not logged into an Apple ID. I suspect this might be preventing the app from accessing certain system-level AVFoundation frameworks or Dolby DRM/licensing certificates required for metadata embedding.
Could you please clarify if the "Dolby Vision Profile" option is dependent on a signed-in Apple ID for hardware-level encoding authorization, or if this is a known compatibility issue with the current iPadOS 26 build?
I look forward to your guidance on how to resolve this.
Best regards,
INSOFT_Fred
I am unable to find any clearcut documentation on configuring AVCaptureSession pipeline to capture video with proResRAW codec type, which is 16 bit format. Is it supported only with AVCaptureMovieFileOutput or one can have AVCaptureVideoDataOutput emitting 16-bit sample buffers that can be vended to AVAssetWriter?