I’ve been following this thread for many months and I think that your work is fascinating. You’ve proven that Dolby Vision is a series of software algorithms protected by a hardware check. Furthermore, you’ve made immense progress in figuring out how those algorithms work and seem to have solved P5, P8 and probably P7 MEL. I have two suggestions:
Can you find somewhere to document how this process of software encoding works?
It might be best, in the future, to work towards an open-source implementation of dovi.ko that can be dropped into an existing installation. This would help protect CoreELEC and make it more widely applicable.
On the second point, it seems to me that dovi.ko has four main components: a hardware check for a valid Dolby Vision license on the SoC; an encoding library that can combine the base video and the Dolby Vision metadata into a single video signal (TV-led); a decoding library that takes the combined video and converts it to YCbCr (player-led); and an encoding library that converts non-DoVi content to DoVi (VS10). My feeling is that VS10 probably just analyzes the video, generates an RPU, and then processes it like a normal DoVi file.
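To make the suggestion concrete, here is a rough sketch of that four-way split as Python stubs. Every name below is hypothetical - nothing here comes from the real dovi.ko - it only illustrates the interfaces an open-source replacement would presumably need to cover:

```python
# Hypothetical decomposition of dovi.ko, matching the four components guessed
# at above. None of these names exist in the real module; they are stubs that
# only illustrate the suggested split.

class DoviModule:
    def check_license(self) -> bool:
        """Hardware check: is a valid Dolby Vision license present on the SoC?"""
        raise NotImplementedError

    def compose_tv_led(self, base_video: bytes, dv_metadata: bytes) -> bytes:
        """Encoding: combine base video and DV metadata into one signal (TV-led)."""
        raise NotImplementedError

    def decode_player_led(self, combined_video: bytes) -> bytes:
        """Decoding: take the combined video and convert it to YCbCr (player-led)."""
        raise NotImplementedError

    def vs10_convert(self, non_dovi_video: bytes) -> bytes:
        """VS10: analyze the video, generate an RPU, then process as normal DoVi."""
        raise NotImplementedError
```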
Not sure what you are referring to here. P5, P8 and P7 MEL were all already well understood, see DoViBaker, dovi_tool, and libplacebo.
Again not sure what you are referring to here. All that I have done in this work is follow the
guidelines for inserting HDR DM metadata into an HDR picture with transfer characteristics as defined in SMPTE ST 2084 [1] for transport over baseband interfaces
as defined in Section 6 of the ETSI group specification “Compound Content Management Specification”, ETSI GS CCM 001 - a publicly available document from 2017 without a single reference to DV, or even Dolby, at all.
What I have done cannot be made into a drop-in replacement for the existing dovi.ko. Almost all of the work and difficulty in getting to where it is was in finding a way to actually embed the metadata, as defined in the specification, in CoreELEC without using any special hardware blocks in the device. The specification itself, as to what needs to be put into the top pixels, is actually fairly straightforward.
That said, others can implement Section 6 of ETSI GS CCM 001 in whatever ways they want. I have made available the code showing the method I found for writing to the OSD layer of CoreELEC.
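For anyone curious what “writing metadata into the top pixels” means mechanically, here is an illustrative Python sketch. To be clear, this is not the actual ETSI GS CCM 001 bit layout (the spec defines its own bit ordering, carrying components and framing) - it just demonstrates the general idea of round-tripping a payload through the pixel values of the first row:

```python
# Illustrative only: packs a metadata byte string into the least significant
# bits of the first pixel row of a frame, and recovers it again. The real
# ETSI GS CCM 001 Section 6 layout differs; this shows only the general idea
# of pixel-embedded metadata.

def embed_bits(row: list[int], payload: bytes) -> list[int]:
    """Overwrite the LSB of each pixel value in `row` with payload bits, MSB first."""
    bits = [(byte >> (7 - i)) & 1 for byte in payload for i in range(8)]
    assert len(bits) <= len(row), "payload does not fit in one row"
    out = row[:]
    for i, bit in enumerate(bits):
        out[i] = (out[i] & ~1) | bit
    return out

def extract_bits(row: list[int], n_bytes: int) -> bytes:
    """Recover `n_bytes` of payload from the row's LSBs."""
    bits = [p & 1 for p in row[: n_bytes * 8]]
    return bytes(
        sum(bit << (7 - i) for i, bit in enumerate(bits[b * 8 : b * 8 + 8]))
        for b in range(n_bytes)
    )
```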
Others may want to do it differently. For example, one (maybe far-fetched) alternative idea I had - with no idea whether this approach would even be possible - was to insert the metadata using Kodi itself, with some sort of “subtitle” positioned over the top rows using a custom “font”.
Based on the DoViBaker source code, couldn’t you do the BL+EL combine under CoreELEC? I guess the CPU power would not be enough and the GPU would have to be used.
Yeah, it’s an implementation problem. What to do is known; no-one seems to know how to do it on CoreELEC. Can’t say I understand why no-one has done it for PC though…
I would guess the same as you - the CPU couldn’t, but the GPU could - though I don’t have the slightest clue about even starting to try that.
To put it in perspective, there is lots of existing code around a second OSD layer and enabling it. Despite that, I still haven’t even been able to turn that on (it would be a much better approach to adding the metadata). Seemingly no-one else knows how to do that either (or anyone that does know hasn’t provided input).
I would imagine that making FEL work would be far harder and would require input from people that actually know the CoreELEC codebase. That said, I don’t see it as technically impossible: all these devices can decode multiple video streams at once in hardware, support two layers of video in hardware, and have a GPU powerful enough to play games, which seems far more computationally expensive than the seemingly small amount of work needed to combine the layers.
edit: the datasheets also mention “inverse quantisation” in relation to the connection into the VPP for a video signal. This is something that would need to be done to the EL layer, so maybe the GPU wouldn’t even be needed for that part if there is specific hardware for that purpose?
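As a toy illustration of the BL+EL combine being discussed - not the real algorithm, since the actual inverse NLQ uses per-RPU coefficients and the BL reshaping is polynomial/MMR (see DoViBaker and dovi_tool for the real thing) - the per-sample arithmetic is roughly: reshape the base layer up to 12-bit, turn the enhancement-layer sample into a signed residual, add, and clamp:

```python
# Very simplified FEL combine for one component. The constants below are made
# up for illustration; real values come from the RPU (NLQ coefficients,
# reshaping pivots/polynomials).

EL_NEUTRAL = 512   # assumed mid-point of a 10-bit EL sample
EL_GAIN = 1        # stand-in for the RPU's NLQ scaling coefficient

def combine_fel(bl_sample: int, el_sample: int) -> int:
    """Reconstruct a 12-bit sample from a 10-bit BL sample + 10-bit EL residual."""
    bl_reshaped = bl_sample << 2                      # 10-bit -> 12-bit (toy reshape)
    residual = (el_sample - EL_NEUTRAL) * EL_GAIN     # inverse quantisation (toy)
    return max(0, min(4095, bl_reshaped + residual))  # clamp to 12-bit range
```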
Thanks for the detailed explanation. I think I understand better now. This is about a hardware vs a software based implementation. Great work. Carry on.
Added the dovi_tv_led_no_colorimetry and dovi_tv_led_bt2020 options. Defaults to dovi_tv_led_no_colorimetry = Y.
Added on-the-fly DV metadata generation for HDR10+ files. This uses the same method as the DoVi_Scripts / dovi_tool DV generation from HDR10+ data. The metadata produced on-the-fly is identical to pre-processing the files.
If there was a way to generate the metadata that could be implemented on-the-fly within CoreELEC, then sure. Actually adding the metadata is easy - but there needs to be a (ideally good) way to determine what values to fill the metadata in with.
Do you know of any way to generate the metadata that may be appropriate?
The only tool I am aware of that can actually generate metadata (without using HDR10+ data) is cm_analyze. I don’t really think that is going to be usable here.
What is the use case / benefit?
It would be straightforward to just add a level 254 extension block to the CMv2.9 metadata - it would then be CMv4.0. But what does that achieve?
I was thinking it might be possible to use dovi_tool to generate the RPU dynamically for these formats. But I think I might have misunderstood what dovi_tool can do.
I was thinking of converting CMv2.9 to CMv4.0 to avoid the black crush issue.
dovi_tool DV generation is based on having dynamic metadata passed into it. From what I can tell, the options are HDR10+ data, a DV XML, or the output from madVR HDR measurements.
What I added matches the HDR10+ generation option.
The source min and max PQ DV metadata are based on the static HDR mastering display min and max luminance. The L1 metadata is based on the HDR10+ histogram - this is the only dynamic metadata that is generated. All the rest of the HDR10+ metadata, i.e., the curve, is ignored.
I do wonder if a better conversion could be done by generating L2 metadata based on the curve though…
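For reference, the PQ side of that generation is straightforward. The sketch below uses the standard SMPTE ST 2084 inverse-EOTF constants and the 12-bit scaling that DV RPUs use for PQ-coded values; feeding it maxrgb statistics follows the description above, but the dict keys are illustrative labels, not the actual RPU syntax:

```python
# Sketch of generating DV L1 metadata from HDR10+ per-frame statistics. The
# ST 2084 (PQ) constants are standard; the 12-bit scaling matches how DV RPUs
# store PQ-coded values.

M1, M2 = 2610 / 16384, 2523 / 4096 * 128
C1, C2, C3 = 3424 / 4096, 2413 / 4096 * 32, 2392 / 4096 * 32

def nits_to_pq(nits: float) -> float:
    """SMPTE ST 2084 inverse EOTF: absolute luminance in nits -> PQ (0..1)."""
    y = max(nits, 0.0) / 10000.0
    return ((C1 + C2 * y ** M1) / (1 + C3 * y ** M1)) ** M2

def pq12(nits: float) -> int:
    """PQ value scaled to the 12-bit integer range used in DV metadata."""
    return round(nits_to_pq(nits) * 4095)

def l1_from_hdr10plus(maxrgb_min: float, maxrgb_avg: float, maxrgb_max: float) -> dict:
    """Per-frame L1 block from HDR10+ maxrgb statistics (all values in nits)."""
    return {
        "min_pq": pq12(maxrgb_min),
        "avg_pq": pq12(maxrgb_avg),
        "max_pq": pq12(maxrgb_max),
    }
```

As a sanity check, 1000 nits comes out as 3079 and 10000 nits as 4095 - the familiar 12-bit PQ anchor points.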
R3S3T_9999 states “Positive lift (offset values over 2048) for L2 trim offset ignored (works in LLDV): Offset positive Lift Test – Google Drive”, so it seems to affect TV-led.
@R3S3T_9999 Can you clarify this situation. Would there be any benefit to converting CMv2.9 to CMv4.0?
Not sure what device this comment was relating to… But it doesn’t affect this build.
The L2 data is sent to the TV as is. The steps files in that folder show a very obvious response for values over 2048.
The benefit of differing from other devices (including the ‘official’ CoreELEC builds) and transmitting the L5 metadata is also very obvious in the ‘proper’ files from that folder - the letterbox areas do not brighten when the L2 trim offset is increased over 2048.
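For context on the 2048 figure in this exchange: the L2 trim fields are stored as unsigned 12-bit values with 2048 as the neutral point, which is why “over 2048” means positive lift. A tiny sketch of that interpretation (my assumption of the decode, matching how tools like dovi_tool display the fields; the exact mapping a display applies is not public):

```python
# Assumed interpretation of the 12-bit L2 trim fields discussed above: values
# are stored unsigned in 0..4095 with 2048 neutral, so anything above 2048 is
# a positive adjustment ("positive lift" for trim_offset), anything below is
# negative.

def trim_to_signed(value: int) -> float:
    """Map an unsigned 12-bit trim value to a signed -1.0..+1.0 adjustment."""
    if not 0 <= value <= 4095:
        raise ValueError("trim fields are 12-bit")
    return (value - 2048) / 2048.0
```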
That’s not the same. When more recent movies are mastered, the Dolby Vision engineer creates a CMv4.0 RPU. However, UHD discs do not support CMv4.0, so the original CMv4.0 is reduced to a CMv2.9 RPU, losing many of the details. Streaming services, on the other hand, do support CMv4.0, and the original RPU is sent to them. The result is that the UHD disc contains the FEL layer while the web-dl has the CMv4.0 RPU. What R3S3T_9999 does is combine the FEL layer from the UHD and the CMv4.0 RPU from the web-dl into a single file, resulting in the best possible quality.
What I’m proposing is converting the CMv2.9 to CMv4.0 to avoid a bug with encoding CMv2.9 files reported by R3S3T_9999.
I haven’t tried your build yet, but are you saying that positive lift works in TV-led? I know it works fine on the Ugoos in LLDV, but in TV-led on all the HDMI devices and TVs I tested, positive lift is ignored.
I continue to see very obvious changes in the image as the offset value increases the whole way through the test videos in the linked folder. Clear changes happen above the 2048 value.
Converting CMv2.9 straight to 4.0 would be a bad idea because that would make the L2 trims inactive. DoVi_Scripts transfers the original CMv4.0 blocks from the streaming web-dl (L8 trims and L3/L9) to the CMv2.9 RPU, which restores the original DV metadata the colorist created. The point of doing that is to get better tone mapping thanks to level 8, which has more controls than CMv2.9 L2. Not to mention that L2 is just a conversion of L8. CMv2.9 also shows some black crush compared to CMv4.0 in patterns.
I wonder if devices not sending the positive lift values to the TV is a workaround to avoid lifting the brightness of the letterbox areas, as they are also not sending L5 data for some reason.