AVFoundation


Work with audiovisual assets, control device cameras, process audio, and configure system audio interactions using AVFoundation.

Posts under AVFoundation tag

200 Posts

Post

Replies

Boosts

Views

Activity

10-Bit UVC on iPadOS
Hello, I've been very familiar with UVC support in iPadOS ever since it launched in iPadOS 17. A number of people use the software I've developed around UVC, and there are often queries about 8-bit vs. 10-bit capture.

My understanding is that the newest UVC spec is 1.5, which was standardised in 2012, and almost every UVC capture card runs at 8-bit. The only 10-bit capture card on my radar is the AJA U-TAP SDI, but it looks like it is 10-bit only up until the UVC stage, where the 10-bit input is downsampled to 8-bit. I have read in some places that it works as a 10-bit capture card on macOS but not on iPadOS.

I was just wondering whether 10-bit via UVC is even possible on iPadOS. If a true 10-bit source were passed into an iPad, would iPadOS allow it, or would it be downsampled by AVFoundation so it can show up as a valid external video input? All USB capture cards I have encountered use one of the following formats:

kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange
kCVPixelFormatType_420YpCbCr8BiPlanarFullRange
kCVPixelFormatType_32BGRA

So if a UVC device delivered a 10-bit format, would that be accessible on iPadOS, or would it fall back to these 8-bit formats by default? Thanks!
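For reference, here is a minimal sketch (not from the original post) of how one might inspect which formats an external UVC device actually advertises to AVFoundation on iPadOS 17 or later. If only 8-bit subtypes appear in device.formats, the 10-bit path is not being exposed to the app.

import AVFoundation
import CoreMedia

// List every format an external (UVC) camera exposes and flag 10-bit ones.
// Assumes iPadOS 17+ where AVCaptureDevice.DeviceType.external is available.
let discovery = AVCaptureDevice.DiscoverySession(
    deviceTypes: [.external],
    mediaType: .video,
    position: .unspecified
)

for device in discovery.devices {
    print("Device: \(device.localizedName)")
    for format in device.formats {
        let subtype = CMFormatDescriptionGetMediaSubType(format.formatDescription)
        let dims = CMVideoFormatDescriptionGetDimensions(format.formatDescription)
        let isTenBit = (subtype == kCVPixelFormatType_420YpCbCr10BiPlanarVideoRange ||
                        subtype == kCVPixelFormatType_420YpCbCr10BiPlanarFullRange)
        print("  \(dims.width)x\(dims.height) subtype=\(subtype) tenBit=\(isTenBit)")
    }
}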
0
0
80
15h
UVC over MFi – Is there official support? Implementation guidance?
Hello everyone, I’m looking for more detailed information regarding UVC (USB Video Class) over MFi within the Apple ecosystem and would appreciate some clarification. I’m interested in developing (or interfacing with) an accessory that transmits video over USB using the UVC standard, and I’d like to better understand how this works within the MFi (Made for iPhone) program. Here are my main questions:

1. Do iOS devices provide native support for UVC over USB-C or Lightning within the MFi framework?
2. Are there any specific firmware or authentication requirements when the accessory is MFi-certified?
3. Does UVC support depend solely on the hardware interface (USB-C vs. Lightning), or are there additional software-level requirements?
4. Is there any official documentation outlining the recommended flow for implementing UVC-based video capture accessories on iOS?

From what I understand, USB-C iPads appear to offer more direct support for standard UVC devices, but it’s not entirely clear how this integrates with the MFi ecosystem on iOS, especially for commercial product development. If anyone has gone through this process or can point me to relevant technical documentation, I would greatly appreciate the guidance. Thank you!
2
0
239
2d
AVAssetDownloadConfiguration: How many video variants are actually downloaded when multiple variants exist in the HLS master playlist?
Hi, I’m trying to better understand how AVAssetDownloadConfiguration selects video variants when downloading HLS content for offline playback.

Suppose I have an HLS master playlist (.m3u8) that contains several video variants defined with #EXT-X-STREAM-INF. For example, the master playlist may contain multiple video streams with the same resolution but different BANDWIDTH values, or with different resolutions (for example 720p, 1080p, etc.).

My question is: how many video variants are actually downloaded when using AVAssetDownloadConfiguration without specifying any variantQualifiers? In other words:
- If the master playlist contains multiple video variants, will the download task fetch only one variant, or multiple variants?
- Does the behavior differ depending on whether the variants differ only by BANDWIDTH or also by RESOLUTION?

What I observed in testing: I always end up with only one video variant downloaded, specifically the one with the highest BANDWIDTH. In the m3u8 files I tested, all video variants had identical parameters (resolution, codec, frame rate, etc.) and differed only by the BANDWIDTH attribute in the master playlist. However, when inspecting the downloaded .movpkg, I noticed something interesting in boot.xml. It lists two video streams: one with complete="true" (the one with the highest bandwidth) and another with complete="no" (the one with the lowest bandwidth). I actually had three video streams listed in the m3u8, but the one with the middle bandwidth wasn't listed in boot.xml at all. There are also additional streams for audio and subtitles in boot.xml. This made me wonder whether the system initially attempts to download another video variant (possibly a lower-bitrate one), but then switches to the highest-quality variant and only completes that one.

Additional question about variantQualifiers: if I provide a predicate such as NSPredicate(format: "peakBitRate > 0"), which should theoretically match all variants, will the download task attempt to download all matching video variants, or will it still select only one?

Summary of the main questions:
- Without variantQualifiers, does AVAssetDownloadConfiguration always download a single video variant, and if so, how is it chosen?
- Does the behavior differ if variants have different resolutions vs. only different bitrates?
- When a predicate matches multiple variants, can multiple video variants actually be downloaded in a single .movpkg?
- Why might boot.xml list multiple video streams when only one appears to be fully downloaded?

Any clarification on the intended behavior would be greatly appreciated. Thanks!
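For context, here is a minimal sketch (assumed setup, not from the original post) of how variantQualifiers attach to a download configuration; the predicate keys such as peakBitRate come from AVAssetVariant, and the URLs and session identifier below are placeholders.

import AVFoundation

// Offline-download setup with an explicit variant qualifier.
let asset = AVURLAsset(url: URL(string: "https://example.com/master.m3u8")!)
let configuration = AVAssetDownloadConfiguration(asset: asset, title: "Sample download")

// Constrain which video variants are eligible; without qualifiers the system
// chooses on its own (typically a single variant, per the question above).
let qualifier = AVAssetVariantQualifier(predicate: NSPredicate(format: "peakBitRate > %d", 3_000_000))
configuration.primaryContentConfiguration.variantQualifiers = [qualifier]

let downloadSession = AVAssetDownloadURLSession(
    configuration: .background(withIdentifier: "offline-downloads"),
    assetDownloadDelegate: nil,
    delegateQueue: .main
)
let task = downloadSession.makeAssetDownloadTask(downloadConfiguration: configuration)
task.resume()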
1
0
231
3d
Massive CoreML latency spike on live AVFoundation camera feed vs. offline inference (CPU+ANE)
Hello, I’m experiencing a severe performance degradation when running Core ML models on a live AVFoundation video feed compared to offline or synthetic inference. This happens across multiple models I've converted (including SCI, RTMPose, and RTMW) and affects multiple devices.

The environment:
- OS: macOS 26.3, iOS 26.3, iPadOS 26.3
- Hardware: Mac14,6 (M2 Max), iPad Pro 11 M1, iPhone 13 mini
- Compute units: cpuAndNeuralEngine

The numbers: when testing my SCI_output_image_int8.mlpackage model, the inference timings are drastically different:
- Synthetic/offline inference: ~1.34 ms
- Live camera inference: ~15.96 ms

Preprocessing is completely ruled out as the bottleneck. My profiling shows total preprocessing (nearest-neighbor resize + feature provider creation) takes only ~0.4 ms in camera mode. Furthermore, no frames are being dropped.

What I've tried. I am building a latency-critical app and have implemented almost every recommended optimization, but the camera-feed penalty remains:
- Matched the AVFoundation camera output format exactly to the model input (640x480 at 30/60 fps).
- Used IOSurface-backed pixel buffers for everything (camera output, synthetic buffer, and resize buffer).
- Enabled outputBackings.
- Loaded the model once and reused it for all predictions.
- Configured MLModelConfiguration with reshapeFrequency = .frequent and specializationStrategy = .fastPrediction.
- Wrapped inference in ProcessInfo.processInfo.beginActivity(options: .latencyCritical, reason: "CoreML_Inference").
- Set DispatchQueue to qos: .userInteractive.
- Disabled the idle timer and enabled iOS Game Mode.
- Exported models using coremltools 9.0 (deployment target iOS 26) with ImageType inputs/outputs and INT8 quantization.

Reproduction: to completely rule out UI or rendering overhead, I wrote a standalone Swift CLI script that isolates the AVFoundation and Core ML pipeline. The script clearly demonstrates the ~15 ms latency on live camera frames versus the ~1 ms latency on synthetic buffers. (I have attached camera_coreml_benchmark.swift and the Core ML model (a very light low-light enhancement model) to this repo on GitHub: https://github.com/pzoltowski/apple-coreml-camera-latency-repro.)

My question: is this massive overhead expected behavior for AVFoundation + Core ML on live feeds, or is this a framework/runtime bug? If expected, what is the Apple-recommended pattern to bypass this camera-only inference slowdown?

One thing I found interesting: when running in debug, the model was faster (not as fast as in the performance benchmark, but faster than 16 ms). Also, if I ran some dummy calculation on a different DispatchQueue, the model seemed to get slightly faster. So maybe this is related to ANE power-state issues (jitter/SoC wake), with the ANE going to sleep too quickly and taking a long time to wake up. Doing a dummy calculation on a background thread is probably not a real solution, though. Thanks in advance for any insights!
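A small sketch (illustrative only; the loaded model, its image input name, and the pre-allocated synthetic warm-up buffer are hypothetical names) of one way to probe the power-state hypothesis raised at the end of the post: run a throwaway prediction immediately before timing the camera-frame prediction and check whether the camera-frame latency collapses back toward the synthetic number.

import AVFoundation
import CoreML

// Assumptions (hypothetical): `model` is a loaded MLModel, `inputName` is its
// image input name, and `warmupBuffer` is an IOSurface-backed CVPixelBuffer
// matching the model's input size.
func timedPrediction(model: MLModel, inputName: String,
                     cameraBuffer: CVPixelBuffer, warmupBuffer: CVPixelBuffer) throws {
    // 1. Throwaway prediction to (possibly) keep the ANE out of a low-power state.
    let warmup = try MLDictionaryFeatureProvider(
        dictionary: [inputName: MLFeatureValue(pixelBuffer: warmupBuffer)])
    _ = try model.prediction(from: warmup)

    // 2. Timed prediction on the live camera frame.
    let input = try MLDictionaryFeatureProvider(
        dictionary: [inputName: MLFeatureValue(pixelBuffer: cameraBuffer)])
    let start = CFAbsoluteTimeGetCurrent()
    _ = try model.prediction(from: input)
    let ms = (CFAbsoluteTimeGetCurrent() - start) * 1000
    print(String(format: "camera-frame prediction: %.2f ms", ms))
}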
4
0
414
4d
AVSpeechSynthesizer reads Mandarin as Cantonese (iOS 26 beta 3)
In iOS 26, AVSpeechSynthesizer reads Mandarin with Cantonese pronunciation. No matter how I set the language, or change my phone's system settings, it doesn't work.

let utterance = AVSpeechUtterance(string: "你好啊")
//let voice = AVSpeechSynthesisVoice(language: "zh-CN") // doesn't work
let voice = AVSpeechSynthesisVoice(language: "zh-Hans") // doesn't work either
utterance.voice = voice
let synth = AVSpeechSynthesizer()
synth.speak(utterance)
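One workaround worth trying (a sketch, not confirmed to resolve the iOS 26 behavior): instead of constructing the voice from a language code, enumerate the installed voices and pick an explicit Mandarin (zh-CN) voice, so the selection cannot fall back to whatever the Siri or system language dictates.

import AVFoundation

// Pick an explicitly Mandarin voice rather than relying on language-code lookup.
let mandarinVoice = AVSpeechSynthesisVoice.speechVoices()
    .first { $0.language == "zh-CN" }

let utterance = AVSpeechUtterance(string: "你好啊")
utterance.voice = mandarinVoice ?? AVSpeechSynthesisVoice(language: "zh-CN")

// Keep a strong reference to the synthesizer in real code, or speech stops early.
let synthesizer = AVSpeechSynthesizer()
synthesizer.speak(utterance)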
3
0
562
4d
On iOS 18, Mandarin is read aloud as Cantonese
Please include the line below in follow-up emails for this request.
Case-ID: 11089799

When using AVSpeechUtterance with the voice set to Mandarin, if Siri is set to Cantonese on iOS 18, the text is spoken in Cantonese. There is no such issue on iOS 17 and 16.

1. let utterance = AVSpeechUtterance(string: textView.text)
   let voice = AVSpeechSynthesisVoice(language: "zh-CN")
   utterance.voice = voice
2. In the phone settings, Siri is set to Cantonese.
4
1
726
5d
Async AVAudioPlayerNode.scheduleBuffer stutters
My code that streams buffers into AVAudioPlayerNode stutters when one buffer finishes and before the next one is played.

while engine.isRunning {
    let framesToCopy = min(buffer.frameLength - framePosition, Self.BufferSize)
    let srcRaw = UnsafeRawPointer(srcPtr)
    let playbackBuffer = AVAudioPCMBuffer(pcmFormat: buffer.format, frameCapacity: Self.BufferSize)!
    let playbackPtr = playbackBuffer.floatChannelData![0]
    let destRaw = UnsafeMutableRawPointer(mutating: playbackPtr)
    memcpy(destRaw, srcRaw, Int(framesToCopy) * MemoryLayout<Float>.stride)
    srcPtr = srcPtr.advanced(by: Int(framesToCopy))
    playbackBuffer.frameLength = framesToCopy
    await player.scheduleBuffer(playbackBuffer, at: nil, options: [], completionCallbackType: .dataRendered)
}

I've tried to schedule multiple buffers at once using a combination of the synchronous and async versions of scheduleBuffer, because I thought the delay might be the cause, but it still stutters, and the data copied into the playbackBuffer matches the source buffer. I've tried all combinations of options and completionCallbackType with no luck. I've tried increasing the buffer size, but that just spaces out the stutters because the buffer is larger. What am I missing about this API?
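A likely explanation (hedged, based on how the async overload behaves): awaiting scheduleBuffer with .dataRendered suspends the loop until the scheduled buffer has already been consumed, so the next buffer is only handed to the player after a gap. A common pattern is to keep several buffers queued ahead. A minimal sketch follows; makeNextBuffer() is a hypothetical stand-in for the copy logic in the original loop, and this must run off the main thread because the semaphore blocks.

import AVFoundation

// Keep several buffers queued ahead so the player never runs dry.
let inFlight = DispatchSemaphore(value: 3)   // allow up to 3 scheduled buffers

func streamBuffers(player: AVAudioPlayerNode, engine: AVAudioEngine,
                   makeNextBuffer: () -> AVAudioPCMBuffer?) {
    while engine.isRunning, let next = makeNextBuffer() {
        inFlight.wait()                       // block only when 3 buffers are already queued
        player.scheduleBuffer(next, at: nil, options: [],
                              completionCallbackType: .dataRendered) { _ in
            inFlight.signal()                 // one slot freed once this buffer has rendered
        }
    }
}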
0
0
49
1w
AudioQueueNewOutput blocks indefinitely on iOS 18.3 (hangs during creation)
Hi everyone, we’re encountering an issue where AudioQueueNewOutput blocks indefinitely and never returns, and we’re hoping to get some insight or confirmation of whether this is a known behavior/regression on newer iOS versions.

Issue description: when triggering audio playback, we create an output AudioQueue using AudioQueueNewOutput. On some devices, the call hangs inside AudioQueueNewOutput and never returns, with no OSStatus error and no subsequent logs. This behavior is reproducible mainly on iOS 18.3; earlier iOS versions do not show this issue under the same code path.

if (audioDes) {
    mAudioDes.mSampleRate = audioDes->mSampleRate;
    mAudioDes.mBitsPerChannel = audioDes->mBitsPerChannel;
    mAudioDes.mChannelsPerFrame = audioDes->mChannelsPerFrame;
    mAudioDes.mFormatID = audioDes->mFormatID;
    mAudioDes.mFormatFlags = audioDes->mFormatFlags;
    mAudioDes.mFramesPerPacket = audioDes->mFramesPerPacket;
    mAudioDes.mBytesPerFrame = audioDes->mBytesPerFrame;
    mAudioDes.mBytesPerPacket = audioDes->mBytesPerFrame;
    mAudioDes.mReserved = 0;
}

// Create AudioQueue for output
OSStatus status = AudioQueueNewOutput(&mAudioDes, AQOutputCallback, this, NULL, NULL, 0, &audioQueue);

The thread blocks inside AudioQueueNewOutput, and execution never reaches the next line.

Additional notes / observations:
- The ASBD is confirmed to be valid: standard PCM output; sample rate, channels, and bytes per frame/packet are all consistent; the same ASBD works correctly on earlier iOS versions.
- The AudioQueue is created on a background thread (not on the main thread, and not inside the AudioQueue callback).
- On first creation, AVAudioSession may not yet be active: setCategory and setActive:YES may be called shortly before creating the AudioQueue, so there may be a timing window where the session is still activating.
- The issue is reported mainly on iOS 18.3: multiple user reports point to iOS 18.3 devices, and the same code path works on iOS 17.x and earlier.
- No OSStatus error is returned; the call simply never returns.

Questions:
- Is it expected that AudioQueueNewOutput can block indefinitely while waiting for AVAudioSession / audio route / HAL readiness?
- Have there been any behavior changes in iOS 18.3 regarding AudioQueue creation or AudioSession synchronization?
- Is it unsafe to call AudioQueueNewOutput before AVAudioSession is fully active on recent iOS versions?
- Are there recommended patterns (or delays / callbacks) to ensure AudioQueue creation does not hang?

Any insight or confirmation would be greatly appreciated. Thanks in advance!
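Not a confirmed fix, but given the observation that the session may still be activating, one pattern that rules that window out is to activate AVAudioSession synchronously, check the result, and only then create the queue. A minimal Swift sketch under that assumption (placeholder PCM format values):

import AVFoundation
import AudioToolbox

func makeOutputQueue() -> AudioQueueRef? {
    // 1. Make sure the session is fully configured and active before touching AudioQueue.
    let session = AVAudioSession.sharedInstance()
    do {
        try session.setCategory(.playback)
        try session.setActive(true)          // returns only once activation completes (or throws)
    } catch {
        print("Session activation failed: \(error)")
        return nil
    }

    // 2. Describe standard 16-bit stereo PCM at 44.1 kHz (placeholder values).
    var format = AudioStreamBasicDescription(
        mSampleRate: 44_100,
        mFormatID: kAudioFormatLinearPCM,
        mFormatFlags: kAudioFormatFlagIsSignedInteger | kAudioFormatFlagIsPacked,
        mBytesPerPacket: 4, mFramesPerPacket: 1, mBytesPerFrame: 4,
        mChannelsPerFrame: 2, mBitsPerChannel: 16, mReserved: 0)

    // 3. Create the output queue only after the session is active.
    var queue: AudioQueueRef?
    let status = AudioQueueNewOutput(&format, { _, _, _ in /* refill buffers here */ },
                                     nil, nil, nil, 0, &queue)
    print("AudioQueueNewOutput status: \(status)")
    return queue
}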
0
0
50
1w
Inquiry regarding CoreMediaErrorDomain Code=-15517 during LL-HLS Live Playback
Hello, I am currently developing a live streaming application using AVPlayer to play LL-HLS (Low-Latency HLS) content. During our testing phase, we consistently encountered the following error in the logs:

CoreMediaErrorDomain Code=-15517

The challenge we are facing is that the error description is quite vague. It only provides cryptic messages such as "Key not found" or "No value information," which makes it extremely difficult to identify the root cause or perform a deep-dive analysis. I have searched through the official Apple Developer documentation and technical notes, but I couldn’t find any specific reference to what code -15517 signifies in the context of LL-HLS or CoreMedia.

Regarding this issue, I have the following questions:
- What is the specific meaning of this error code (-15517)? Does it relate to missing tags in the HLS manifest, or is it an internal state issue within the AVPlayer stack?
- Is this a critical error that disrupts playback, or a warning that can be safely ignored?
- Is there any additional logging or debugging tool you would recommend to further investigate "Key not found" issues in LL-HLS?

Any insights or guidance from the community or Apple engineers would be greatly appreciated. Thank you in advance for your help.
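Not an answer to what -15517 means, but one more source of detail worth pulling (a sketch): AVPlayerItem's error log often carries the underlying CoreMedia status code, the offending URI, and a comment string that is more descriptive than the top-level error.

import AVFoundation

// Dump the player item's error log whenever a new entry is added.
func observeErrorLog(for item: AVPlayerItem) -> NSObjectProtocol {
    NotificationCenter.default.addObserver(
        forName: AVPlayerItem.newErrorLogEntryNotification,
        object: item, queue: .main
    ) { _ in
        guard let events = item.errorLog()?.events else { return }
        for event in events {
            print("status: \(event.errorStatusCode), domain: \(event.errorDomain), " +
                  "comment: \(event.errorComment ?? "-"), uri: \(event.uri ?? "-")")
        }
    }
}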
1
0
127
1w
Monitors from Dell not fully-integrated with MacOS keyboard control
I just bought an S2725QC monitor from Dell Technologies and it isn't fully integrated with macOS, even though the website says it is compatible with macOS: https://www.dell.com/en-us/shop/all-monitors/sac/monitors/all-monitors/macos-compatible?appliedRefinements=51765 The screen brightness and volume control buttons don't work with the monitors (I have two). What can I do in terms of writing code with the Dell Monitor SDK and macOS frameworks/technologies?
1
0
107
1w
Video in "Made for iPad" apps on macOS
I'm relatively new to Swift development (and native iOS development, for that matter). I've got an iOS app that uses the iPhone/iPad built-in cameras, and am looking to make it more compatible with macOS. Using the normal AVCaptureDevice.DiscoverySession I seem to get the iPhone Continuity Camera and the built-in MacBook Pro camera, but I don't see other input devices that I see in QuickTime Player (for example), such as connected external cameras or virtual inputs provided by NDI Virtual Input and OBS. Is there a way to access these without a specific Mac build? (The rest of the functionality works great, and I'd rather not diverge the codebase too much, as it's easier to update one app than two.)
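For reference, a sketch of a broader discovery query (treat it as an experiment: how much of this is honored inside an iPad-app-on-Mac process is exactly the open question). The .external device type covers external cameras on recent OS versions; NDI and OBS virtual cameras on modern macOS come from Core Media I/O extensions, and whether those are surfaced to an iPad app running on a Mac is part of what needs testing.

import AVFoundation

// Ask for built-in, Continuity, and external cameras in one query.
// .external is available on macOS 14+ / iPadOS 17+; older macOS used .externalUnknown.
let discovery = AVCaptureDevice.DiscoverySession(
    deviceTypes: [.builtInWideAngleCamera, .continuityCamera, .external],
    mediaType: .video,
    position: .unspecified
)

for device in discovery.devices {
    print("\(device.localizedName) [\(device.deviceType.rawValue)]")
}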
0
0
117
2w
AVAudioSession.outputVolume does not reflect system volume changes made while app is in background
I have a question regarding the behavior of AVAudioSession.sharedInstance().outputVolume.

Observed behavior:
- When the app is in the foreground, I read audioSession.outputVolume (for example, 0.1).
- The app is then moved to the background.
- While the app is in the background, the user changes the system volume using the hardware buttons (for example, to 0.5).
- When the app returns to the foreground, audioSession.outputVolume still reports the previous value (0.1).

From my testing, outputVolume only seems to update when the system volume is changed while the app is in the foreground. Volume changes made while the app is in the background are not reflected when the app returns to the foreground.

According to Apple’s documentation for AVAudioSession.outputVolume, it is "The systemwide output volume set by the user." (https://developer.apple.com/documentation/avfaudio/avaudiosession/outputvolume) However, based on our testing on iOS 18.6.2 and iOS 18.1, the observed behavior seems to differ from this description.

Questions:
- The documentation states that outputVolume represents the system-wide volume set by the user, yet in our testing the value does not reflect volume changes made while the app is in the background and only updates while the app is in the foreground. Is this the expected behavior of AVAudioSession.outputVolume?
- Is there any other recommended way in Swift to retrieve the current system volume that reflects user changes made both while the app is in the foreground and while it is in the background?

Any clarification on the intended behavior or recommended handling would be greatly appreciated.
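One pattern worth trying (a sketch, not confirmed to change the behavior described above): observe outputVolume via KVO for foreground changes, and on didBecomeActive re-activate the shared session before reading the property again, to test whether the cached value refreshes once the session is (re)activated.

import AVFoundation
import UIKit

final class VolumeWatcher {
    private var observation: NSKeyValueObservation?
    private let session = AVAudioSession.sharedInstance()

    init() {
        // Fires for volume changes while the app is in the foreground.
        observation = session.observe(\.outputVolume, options: [.new]) { _, change in
            print("outputVolume changed: \(change.newValue ?? -1)")
        }

        // On return to foreground, re-activate the session and re-read the volume.
        NotificationCenter.default.addObserver(
            forName: UIApplication.didBecomeActiveNotification,
            object: nil, queue: .main
        ) { [weak self] _ in
            guard let self else { return }
            try? self.session.setActive(true)
            print("outputVolume after foregrounding: \(self.session.outputVolume)")
        }
    }
}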
1
1
184
2w
Keeping PiP alive during third-party video recording (camera capture)
I’m building a teleprompter-style app that relies on Picture in Picture. PiP starts correctly on device. Everything works — until another app (e.g. TikTok / Instagram) starts active video recording. When camera capture begins in the foreground app, iOS terminates my PiP session. Some teleprompter apps appear to keep PiP active while recording in other apps, so I’m trying to understand the recommended architectural pattern for this scenario. Is there a documented approach or best practice to keep PiP stable during third-party camera capture? Looking specifically for guidance on the correct AVKit / AVAudioSession configuration for this use case.
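For reference, a baseline PiP setup sketch (assumed code, not from the post). Whether any AVAudioSession configuration keeps PiP alive once another app opens the camera is exactly the open question here, so this is only a starting point for experimenting with .mixWithOthers and automatic PiP start.

import AVKit
import AVFoundation

// Baseline PiP setup for a player-backed layer; not a guarantee that PiP
// survives another app starting camera capture.
func makePiPController(playerLayer: AVPlayerLayer) -> AVPictureInPictureController? {
    let session = AVAudioSession.sharedInstance()
    try? session.setCategory(.playback, options: [.mixWithOthers])
    try? session.setActive(true)

    guard AVPictureInPictureController.isPictureInPictureSupported() else { return nil }
    let controller = AVPictureInPictureController(playerLayer: playerLayer)
    controller?.canStartPictureInPictureAutomaticallyFromInline = true
    return controller
}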
0
0
198
3w
Crash when trying to get originatingRecipient
According to the documentation (https://developer.apple.com/documentation/avfoundation/avcontentkeyrequest/originatingrecipient?changes=_3&language=objc), starting with iOS 18.4 I can get the AVContentKeyRecipient from an AVContentKeyRequest. But when I try to read it, I get a crash. What could be the issue? I want to note that I add the asset to the AVContentKeySession using the addContentKeyRecipient method (https://developer.apple.com/documentation/avfoundation/avcontentkeysession/addcontentkeyrecipient(_:)?changes=_3&language=objc).
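One thing to rule out (a sketch, assuming the Swift-exposed name matches the linked documentation): make sure the property is only touched on runtimes where it exists, since reading a brand-new API on an older system through Objective-C-bridged code can trap at runtime.

import AVFoundation

// Guard the new API at runtime and fall back gracefully (iOS 18.4+ only).
func recipient(for keyRequest: AVContentKeyRequest) -> AVContentKeyRecipient? {
    if #available(iOS 18.4, *) {
        return keyRequest.originatingRecipient
    }
    return nil
}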
1
0
210
3w
VideoMaterial Black Screen on Vision Pro Device (Works in Simulator)
App overview:
- App name: Extn Browser
- Bundle ID: ai.extn.browser
- Purpose: a visionOS web browser that plays 360°/180° VR videos in an immersive sphere environment

Development environment & SDK versions:
- Xcode: 26.2
- Swift: 6.2
- visionOS deployment target: 26.2
- Swift Concurrency: MainActor isolation enabled
- The app is released in TestFlight.

Frameworks used:
- SwiftUI (UI framework)
- RealityKit (3D rendering, MeshResource, ModelEntity, VideoMaterial)
- AVFoundation (AVPlayer, AVAudioSession)
- WebKit (WKWebView for browser functionality)
- Network (NWListener for local proxy server)

Sphere video mechanism. The app creates an immersive 360° video experience using the following approach:

// 1. Create sphere mesh (10 meter radius for immersive viewing)
let mesh = MeshResource.generateSphere(radius: 10.0)

// 2. Create initial transparent material
var material = UnlitMaterial()
material.color = .init(tint: .clear)

// 3. Create entity and invert sphere (negative X scale)
let sphere = ModelEntity(mesh: mesh, materials: [material])
sphere.scale = SIMD3<Float>(-1, 1, 1) // Inverts normals for inside-out viewing
sphere.position = SIMD3<Float>(0, 1.5, 0) // Eye level

// 4. Create AVPlayer with video URL
let player = AVPlayer(url: videoURL)

// 5. Configure audio session for visionOS
let audioSession = AVAudioSession.sharedInstance()
try audioSession.setCategory(.playback, mode: .moviePlayback, options: [.mixWithOthers])
try audioSession.setActive(true)

// 6. Create VideoMaterial and apply to sphere
let videoMaterial = VideoMaterial(avPlayer: player)
if var modelComponent = sphere.components[ModelComponent.self] {
    modelComponent.materials = [videoMaterial]
    sphere.components.set(modelComponent)
}

// 7. Start playback
player.play()

ImmersiveSpace configuration:

// browserApp.swift
ImmersiveSpace(id: appModel.immersiveSpaceID) {
    ImmersiveView()
        .environment(appModel)
}
.immersionStyle(selection: .constant(.mixed), in: .mixed)

Entitlements:

<!-- browser.entitlements -->
<key>com.apple.security.app-sandbox</key>
<true/>
<key>com.apple.security.network.client</key>
<true/>
<key>com.apple.security.network.server</key>
<true/>

Info.plist network configuration:

<key>NSAppTransportSecurity</key>
<dict>
    <key>NSAllowsArbitraryLoads</key>
    <true/>
</dict>

The issue. Behavior in the Simulator: video plays correctly on the inverted sphere surface; the 360° video is visible and wraps around the user as expected. Behavior on a physical Vision Pro: the sphere displays a black screen. No video content is visible, though the sphere entity itself is present.

Important: this is not a DRM/licensing issue. It is NOT related to Digital Rights Management (DRM) or FairPlay. I have tested with unlicensed raw MP4 video files (no DRM protection), self-hosted video content with no copy protection, and direct MP4 URLs from a CDN without any licensing requirements. The same black-screen behavior occurs with all unprotected video sources (plain H.264 MP4, no DRM), ruling out DRM as the cause.

Screen recording (working in the Simulator): the following screen recording demonstrates playing a 360° YouTube video in the immersive sphere on the visionOS Simulator: https://cdn.commenda.kr/screen-001.mov This confirms that the VideoMaterial and sphere rendering work correctly in the simulator, but the same setup shows a black screen on the physical Vision Pro device.
Observations:
- AVPlayer status reports .readyToPlay; the video appears to load successfully.
- VideoMaterial is created without errors; no exceptions are thrown.
- The sphere entity renders; the geometry is visible (as a black surface).
- The audio session is configured; no errors during audio session setup.
- Network requests succeed; the video URL is accessible from the device.
- Same result with local/unprotected content; DRM is not a factor.

Console logs (device) show:
- Sphere created and added to scene
- AVPlayer created with correct URL
- VideoMaterial created and applied
- Player status transitions to .readyToPlay
- player.play() called successfully
- Rate shows 1.0 (playing)

Despite all success indicators, the rendered output is black.

Questions for Apple:
1. Are there known differences in VideoMaterial behavior between the visionOS Simulator and physical Vision Pro hardware?
2. Does VideoMaterial(avPlayer:) have specific video codec/format requirements that differ on device? (The test video is a standard H.264 MP4.)
3. Is there a required Metal capability or GPU feature for VideoMaterial that may not be available in certain contexts on device?
4. Does the immersion style (.mixed) affect VideoMaterial rendering on hardware?
5. Are there additional entitlements required for video texture rendering in RealityKit on physical hardware?

Attempted solutions:
- Configured AVAudioSession with the .playback category
- Added a delay before player.play() to ensure the material is applied
- Verified sphere scale inversion (-1, 1, 1)
- Tested multiple video URLs (including raw, unlicensed MP4 files)
- Confirmed network connectivity on the device
- Ruled out DRM/FairPlay by testing unprotected content

Environment details:
- Device: Apple Vision Pro
- visionOS version: 26.2
- Xcode version: 26.2
- macOS version: Darwin 25.2.0
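One differential test worth running (a sketch; not a confirmed fix for the device-only black screen): drive the same AVPlayer through RealityKit's VideoPlayerComponent on a plain entity. If that renders on hardware while the VideoMaterial sphere stays black, the problem is narrowed to the material/texture path rather than to the player or the asset.

import RealityKit
import AVFoundation

// Differential test: render the same player via VideoPlayerComponent.
func makeVideoTestEntity(player: AVPlayer) -> Entity {
    let entity = Entity()
    entity.components.set(VideoPlayerComponent(avPlayer: player))
    entity.position = SIMD3<Float>(0, 1.5, -2) // a couple of meters in front of the user (placeholder)
    player.play()
    return entity
}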
0
0
229
3w
Missing "Dolby Vision Profile" Option in Deliver Page - DaVinci Resolve 20 on iPadOS 26
Dear Support Team,

I am writing to seek technical assistance regarding a persistent issue with Dolby Vision exporting in DaVinci Resolve 20 on my iPad Pro 12.9-inch (2021, M1 chip) running iPadOS 26.0.1.

The issue: despite correctly configuring the project for a Dolby Vision workflow and successfully completing the dynamic metadata analysis, the "Dolby Vision Profile" dropdown menu (and related embedding options) is completely missing from the Advanced Settings on the Deliver page.

My current configuration and steps taken:
- Software version: DaVinci Resolve Studio 20 (Studio features like Dolby Vision analysis are active and functional).
- Project settings: Color Science set to DaVinci YRGB Color Managed; Dolby Vision enabled (version 4.0) with Mastering Display set to 1000 nits; Output Color Space set to Rec.2100 ST2084.
- Color page: dynamic metadata analysis has been performed, and the "Trim" controls are functional.
- Export settings: Format QuickTime / MP4; Codec H.265 (HEVC); Encoding Profile Main 10.

The problem: under "Advanced Settings," there is no option to select a Dolby Vision Profile (e.g. Profile 8.4) or to "Embed Dolby Vision Metadata."

Potential variables:
- System version: I am currently running iPadOS 26.
- Apple ID: my iPad is currently not signed in to an Apple ID. I suspect this might be preventing the app from accessing certain system-level AVFoundation frameworks or Dolby DRM/licensing certificates required for metadata embedding.

Could you please clarify whether the "Dolby Vision Profile" option is dependent on a signed-in Apple ID for hardware-level encoding authorization, or whether this is a known compatibility issue with the current iPadOS 26 build?

I look forward to your guidance on how to resolve this.

Best regards,
INSOFT_Fred
0
0
142
4w