THE TOOLS OF ‘REAL’ DIGITAL CINEMA
ACQUISITION: THE EMULATION OF THE PHYSICAL AND CREATIVE CHARACTERISTICS OF 35MM FILM IN THE DIGITAL DOMAIN.
INTRODUCTION: THE TECHNICALITIES OF AESTHETICS
Video imaging, as a prosaic process, relies on the photoelectric stimulation of a sensor, with an electronic signal path to an electronic acquisition medium: either ferromagnetic tape or, more recently, solid-state digital memory. The resulting images can usually be reviewed on the capture device an instant after capture, and either marked for retention or deleted out of existence.
Film imaging is based on a photochemical process, wherein colour-coupled microscopic silver halide particles are ‘excited’ by focused light and create a chromatically opposite image to that of the real scene. This latent image has to undergo chemical development, washing and fixing before a second tranche of imaging media is photochemically exposed through this negative to produce the temporally static series of positive images that form the sequence, assuming that one has suitable exhibition hardware available.
The two statements above broadly describe the fundamental physical differences between the two imaging media and, to my mind, foreground the essential differences in the creative approaches to their use. The paradox of differentiation that unfolds as technological achievement accelerates is succinctly described by D. N. Rodowick in ‘The Virtual Life of Film’ (2007: 101): “Curiously for an industry driven by innovation and market differentiation, the qualities of the ‘photographic’ and the ‘cinematic’ remain resolutely the touchstones for creative achievement in digital imaging entertainment”. This portentous contention foregrounds the aesthetic sensibilities of film over video, not only in the physical and technical craft but also in the ephemeral qualities of the process: the ‘dark art’ of cinema direction and cinematography that elevates spectatorial appreciation to that of artists at work. The perception of video as the utilitarian acquisition of visual information, set against the nature of film as the palette and canvas of the auteur and creative communicator, is of course forged in the provenance of their contrasting histories, with the inception of video technology essentially driven by market forces in the USA to create a single prime-time experience across the three time zones.
In essence, all forms of anthropological communication and dissemination are a CODEC: a system of encoding and decoding language in succinct and digestible pieces, as a delivery mechanism for concepts examining the whole spectrum of the human condition. Why, then, does digital imaging seek the patronage and cachet of ‘cinema’ as an idiom, a means of justification? The answer lies buried in the mechanisms of 35mm cinematography. In the three years since Rodowick’s missive, those ‘touchstones’ have been attained on several fronts, to the point at which the label ‘digital cinema’, whether as a mark of technical inferiority, as an excuse for content of lesser quality or indeed as an indication of ‘cutting edge’ innovation, has become irrelevant: in 2010 the electronic emulation of celluloid has become visually indistinguishable from the original.
DIGITAL NOSTALGIA: THE FOUR ‘RAISONS D’ÊTRE’ OF 35MM CINEMA ACQUISITION
There are four main physical components to the look and feel of 35mm film as a spectatorial experience: exposure latitude, native resolving power, the time base of capture and, perhaps the Holy Grail of aspiration for digital acquisition systems, depth of field. By contrast video imaging, bereft of these ‘organic’ attributes, has traditionally appeared to the viewer with a plasticised immediacy: a statement of electronic fact with little of the sublimation of passion and creativity endowed by the ‘cinema’ experience. Each of these components is in a sense a major dialect in the language of the moving image, and this essay will seek to explain their functions.
Exposure latitude, in conjunction with tonal range, refers to the range of light conditions that can be accurately acquired during ‘recording’ whilst still maintaining some semblance of detail within the highlights and shadows of the scene. There has always existed a marked difference in the way that the two systems achieve this, and the superiority of the film structure owes as much to the notion of folk memories as it does to 20th century chemical engineering. By this I refer to the evolution of the human eye and its ability to ‘see’ in the dark, to explore the shadows. Our colour vision is abandoned in shadowy conditions as the more photo-reactive yet monochrome rod receptors (which outnumber the colour cone receptors by roughly 20:1) scan the shadows for threat and menace; it is certainly also a truism that most English-language words for darkness and shadows carry with them a connotation of menace and potential danger. It is this spectatorial foible that allows film to exploit the fear in the shadows by ‘seeing’ into the dark and capturing the fleeting details that support the narrative. The physics of film latitude are based around the f-stop scale, a set of numbers with which most people interested in photography will be familiar: 2, 2.8, 4, 5.6 and so on. Each step up this scale represents a halving of the amount of light hitting the emulsion of the film, and the numbers are derived from the square root of 2, which is approximately 1.4. It therefore follows that 1.4 x 1.4 = 2, 2 x 1.4 = 2.8 and 2.8 x 1.4 = 4 (each rounded to the conventional marking), and that as the stop number increases, the amount of light halves. Exposure latitude refers to the ability of the medium to record detail across a range of f-stops from shadow to highlight, and the average on film would be around 8 stops. By way of example, for a given exposure of f5.6, detail would be recorded from f2 to f32, beyond which the image would become either solid black or burnt-out white.
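The arithmetic of the f-stop scale can be checked with a few lines of Python. What follows is a minimal sketch rather than any industry tool: the function names are my own, and it works with exact multiples of the square root of 2 rather than the rounded markings engraved on a lens.

```python
import math

def f_stop_series(start=1.0, count=11):
    """Exact f-numbers: each full stop multiplies the previous one by sqrt(2)."""
    return [start * math.sqrt(2) ** i for i in range(count)]

def relative_light(f_number, reference=1.0):
    """Light admitted relative to a reference aperture (inverse square of the f-number)."""
    return (reference / f_number) ** 2

def stops_between(f_low, f_high):
    """Number of full stops separating two f-numbers."""
    return 2 * math.log2(f_high / f_low)

if __name__ == "__main__":
    for n in f_stop_series():
        print(f"f/{n:5.2f}  relative light {relative_light(n):.4f}")
    # The latitude example in the text: detail held from f2 to f32.
    print(f"Stops from f2 to f32: {stops_between(2, 32):.1f}")
```

Run as a script, the final line confirms that the span from f2 to f32 is eight full stops, each one halving the light admitted by the stop before it.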
It is important to note the non-linearity of the scale, as this is a facet of the narrative capabilities of film and again is due to the method of construction. Film generally exists as a two-stage process, negative and print, which reveals latitude characteristics unavailable to any video imaging system: the brighter portions of the scene cause a stronger photochemical reaction, resulting in darker portions of the negative, so it follows that an intense degree of light is required to ‘blacken’ the negative completely. During the printing process the opposite outcome is achieved, displaying the classic characteristics of film latitude, wherein detail is available in the highlights but the shadows soon fall away to blackness, imbuing the scene with an organic contrast and an inherent sense of mystery: the potential to thrill.
Electronic imaging has traditionally been severely disadvantaged in the area of exposure latitude due to the ‘mathematical nature’ of its photo-reactivity, but changes in workflow paths, combined with the increased sensitivity and refinement of digital sensors, have heralded a major stride forward in cinematic emulation for video technology. Traditionally, electronic sensors comprised three individually colour-sensitive charge-coupled devices (CCDs), each reacting to one of the primary colours of light (RGB) through a beam splitter, their outputs then combined in a binary signal path to an acquisition medium (traditionally magnetic tape ‘omega’ wrapped over a drum spinning at over 14,000 rpm) to create the master copy. As previously stated, it is the physical and mathematical constraints of the system that bestow the images with the plasticised ‘hyper-reality’ that typifies video capture, with its information-platform sensibilities, and the lack of tonal latitude is a major contributing factor. Unlike the negative sensitivity of film, electronic imaging reacts to light in a positive, linear fashion: the brighter the portion of the scene, the greater the charge on the sensor and the more intense the translation of information to the acquisition path, up to the point of peak capability, beyond which the sensor cannot cope and ‘clips’ the signal. The effect of this electronic full stop is to create areas of brightness in the scene that are merely incandescent mush, devoid of detail and, as a consequence, of any narrative potential. This is the polar opposite of the reactivity curve of film, where the negative-positive process preserves the highlights up to six stops above optimum exposure but crushes the shadows at two to three stops under, much like human daylight vision struggling for clarity in the shadows. Paradoxically, the electronic image preserves a greater sense of detail in the shadows, though not necessarily at the point of capture.
The development of acceptable latitude ranges for digital cinema emulation stems from two sources of research. The first is the development of single-chip sensor technology, the complementary metal-oxide semiconductor (CMOS) sensor, and the second is improved data write speeds and ingest into post-production manipulation suites. The impact of the CMOS chip also spans the questions of resolution and DoF, but for the moment it represents a colossal leap in the curve of sensitivity to light. CCD chips are analogous to electronic ‘buckets’, filling with electrical charge and despatching that information when all pixel buckets are essentially full, the shadow ones still filling as the highlights are overflowing, which creates the clipping effect. Single CMOS sensor pixels, by contrast, react instantly and deliver the information as a frame package to the signal path, preserving more of the highlight detail relative to the optimum exposure level. This produces a sensitivity characteristic akin to that of film but with one important advantage: substantially more shadow detail is preserved. With contemporary workflow ingest and processing, this latent detail in the image can be boosted to produce a tonal range that is now beyond the capabilities of film. Post-production tools can ‘skew’ the native sensitivity curve to maintain highlight detail whilst foregrounding shadow information, producing images with apparently greater latitude than film. This is the domain of the colourist, whose role in image manipulation has grown with the technology to become as crucial as that of the cinematographer in those specific aspects of creativity. Indeed, the major credo for any cinematographer or director working with digital imaging should be to concentrate on maintaining detail in the highlights, because the information in the shadows will be preserved, albeit not revealed until the post process.
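The credo of protecting the highlights can be illustrated with a toy model. This is only a sketch under stated assumptions: the clip level of 1.0, the sample light values and the simple power curve standing in for a colourist's grade are all hypothetical, chosen to show the principle rather than to describe any real camera or grading suite.

```python
def sensor_signal(scene_light, clip_level=1.0):
    """Linear response up to clip_level, then flat: anything brighter is clipped."""
    return min(scene_light, clip_level)

def lift_shadows(signal, gamma=0.5):
    """Illustrative grade: a power curve with gamma below 1 raises dark values the most."""
    return signal ** gamma

if __name__ == "__main__":
    highlights = [1.5, 3.0]   # two different bright areas, both beyond the clip level
    shadows = [0.01, 0.02]    # two different dark areas, well inside the sensor's range
    print("highlights as recorded:", [sensor_signal(v) for v in highlights])
    print("shadows as recorded:   ", [sensor_signal(v) for v in shadows])
    print("shadows after the lift:", [round(lift_shadows(sensor_signal(v)), 3) for v in shadows])
```

The two highlights are recorded as the same value, so no later adjustment can separate them, whereas the two shadow values remain distinct and are pulled further apart by the lift.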
Resolution is a wide-spectrum term with two specific values relevant to the ambitions of digital cinema. Unlike the dissimilarities in the systems of latitude, both film and electronic imaging are constantly subject to mutually beneficial improvements in the transfer of information, a convergence that ultimately favours digital cinema.
Of the two key sites of resolution defining the cinematic experience, the first is, by definition, the initial point of contact with whatever reality the film makers wish to preserve: the lens. Whilst optical theory is beyond the scope and indeed interest of this essay, it is important to understand the principle of GIGO (Garbage In, Garbage Out) with regard to the necessity of optimising the information available for the sensor or film to record: the lower the resolving power of the lens, the lower the quality of the image produced. Current specifications for professional image capture for cinema extend to a horizontal resolution of 4000 pixels (or 4K) for digital, and low ISO film stocks may be measured at up to 8K, so we can see that there is a finite requirement and specification for the lenses to perform to. The Modulation Transfer Function (MTF) of a lens measures its capability to transmit cyclic changes, or modulated image detail, expressed on a frequency scale (for television, conventionally in Hertz). Taking the recent historical requirements for the models of cinema and television, the lenses for video cameras were required to resolve to a power of 5.5 MHz, or 550 black and white cyclic changes over the vertical span of a TV picture, or under the contemporary nomenclature 0.43 Megapixel. Clearly this fashionable digital taxonomy exposes the weakness in our accepted domestic delivery systems, but it also highlights the scale of ambition for mature digital cinema. If we apply the same measurement criteria to the new High Definition platform, we discover that it requires, at maximum quality, a resolution of 1080 progressively scanned lines, which translates to a Megapixel value of 2.1, somewhat short of the capabilities of most mobile phones and small cameras.
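The megapixel comparison is simple multiplication, sketched below. The raster dimensions are my assumptions rather than figures given above: a 768 x 576 square-pixel raster for standard definition (which lands close to, but not exactly on, the 0.43 quoted), 1920 x 1080 for High Definition and a 4096 x 2160 container for 4K.

```python
def megapixels(width, height):
    """Pixel count of a raster, expressed in millions of pixels."""
    return width * height / 1_000_000

# Assumed raster sizes for illustration only.
FORMATS = {
    "Standard definition (768 x 576)": (768, 576),
    "High definition (1920 x 1080)": (1920, 1080),
    "4K (4096 x 2160)": (4096, 2160),
}

if __name__ == "__main__":
    for name, (width, height) in FORMATS.items():
        print(f"{name}: {megapixels(width, height):.2f} Megapixel")
```

On these assumptions High Definition carries a little under five times the pixel count of standard definition, and 4K a little over four times that of High Definition.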
With the history of optical development in professional movie cameras, the resolving power and mechanisms of transfer already exist to preserve vast quantities of detail and information. It has therefore fallen to the developers of digital cinema products to model the mechanics of capture on 35mm devices in order to benefit from the century of R&D that has characterised cinema reproduction. The resolving power available from certain sets of high-end professional lenses (developed by spy satellite manufacturers and the like) is beyond the mechanics of any cinema capture device, and so the native resolving capability falls to that of the sensor or the piece of film.
The resolving power of a capture device and its inherent sensitivity are inextricably linked characteristics, and in this area digital imaging, with the most recent innovations, now performs beyond the capabilities of film. We need to return to the mechanical qualities of the CMOS sensor and understand why these physical absolutes are sufficiently capable of furnishing the spectator with the visual aesthetic expected of cinema. As explained earlier, the CMOS chip is a single-plane sensor with pixels designated to react to different parts of the primary spectrum, which returns us to the previous notion of emulating the characteristics of the human eye by foregrounding the number of monochromatic ‘rod’ pixels over the chrominance-sensitive ‘cone’ pixels. This mosaic patterning was invented by Bryce Bayer (rather ironically a Kodak physicist at the time) and, with a nod to the Technicolor three-strip process used from the Thirties to the Fifties, uses twice as many green-sensitive pixels to carry the ‘luminance’ or contrast information, whilst the red and blue pixels record chrominance only. ‘Bayer’ patterning, expressed as GRGB, is the first stage in the CODEC of the reality in front of the camera: to all intents and purposes the lens has preserved the image and temporality as if real, but the sensor makes decisions and adds an electronic (or, in the case of film, chemical) character to the imaging, an interpretative version of the truth of the scene. It is this patterning that enables the manufacture of digital sensors that can now outperform film stocks in terms of sensitivity, a commercially significant advantage which, coupled with the low cost of recording media, has marginalised film origination to the upper echelons of production and aesthetic requirements.
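The two-to-one weighting of green in the Bayer mosaic can be shown by building a small patch of the pattern and counting the sites. This is a sketch only: the tile ordering used here (green and red on one row, blue and green on the next) is one common layout and is my assumption, since the GRGB shorthand names the four sites without fixing their arrangement.

```python
from collections import Counter

# One 2 x 2 tile of the mosaic, read row by row: G R on the first row, B G on the second.
TILE = (("G", "R"), ("B", "G"))

def bayer_mosaic(rows, cols):
    """Repeat the 2 x 2 tile across a sensor of the given size."""
    return [[TILE[r % 2][c % 2] for c in range(cols)] for r in range(rows)]

def colour_counts(mosaic):
    """Count how many photosites carry each colour filter."""
    return Counter(site for row in mosaic for site in row)

if __name__ == "__main__":
    patch = bayer_mosaic(4, 6)
    for row in patch:
        print(" ".join(row))
    print(dict(colour_counts(patch)))
```

On any patch with even dimensions, green sites account for half of the total and red and blue for a quarter each.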
As with the resolving requirements for lenses on TV, electronic sensors (particularly CCD devices) were also pitched at the same quality and price points, to preserve the workflow qualities of the system and little more. This narrow bandwidth of information, combined with the lack of headroom in the tonal range and the small sensor size, marked the video image as something very different from the magnificence of the cinema film image until the emergence of ‘Full Frame’ CMOS sensors over the last half decade. The professional digital video camera sensor has been standardised at 2/3 inch from analogue systems through to digital and into high definition territory, with prosumer cameras struggling with even smaller chips down to 1/3 inch, an extreme CODEC process to the detriment of the image preserved. Improvements in manufacturing tolerances and cross-fertilisation from the digitally mature stills market have heralded the production of CMOS sensors with the same dimensional characteristics as 35mm film, available with increasing commercial alacrity in ever cheaper products. The most obvious advantage of providing a digital 35mm platform is the availability of ancillaries and accessories from an established industry; the sensor, signal path and recording media are the only deviations from the norm, given that negative film is itself now routinely ingested as a digital intermediate, at up to 4K, for post workflow. Apart from the obvious homogenisation of resources and production practice, the development of the full frame digital sensor has had a seismic effect on the longevity of film as an origination medium. Along with the increased sensitivity and wide tonal range available, the big sensor cameras come equipped with that most ephemeral and aesthetic of 35mm qualities, the subject of a thousand scholarly missives: Depth of Field.
Depth of Field:
Depth of field (DoF) is the term used to define the band of acceptable sharpness within a given shot or scene. The perception of focus is not merely a matter of the optical precision with which moving lens elements converge the bent light beams onto a given receptor surface, but a combination of that and the other factors discussed above.
The most significant factor in determining DoF is in fact the size of the target sensor, not only with respect to its native resolving power (its ability to capture the photo-data from the lens) but because its physical dimensions determine the magnification required to ‘fill’ the chip with information. At this stage it seems appropriate to foreground one of the major ‘dark art’ mechanisms of cinematography. It would seem obvious to most observers that all the above factors, together with environment, specific scene issues, workflow practice and post-production manipulation, combine to produce the finished shot, and that years of experience and practice are required to absorb the complicated procedures; yet the application of the parameters at the ‘sharp’ end of the process is staggering in its simplicity. To condense this symbiosis into one sentence: as a working rule of thumb, every determining factor impacts upon the other mechanisms by a simple halving or doubling of the other parameter. To explain further, it seems appropriate to follow the system from the lens through to acquisition.
The millimetre measurement of the lens, its focal length, refers to the distance required to ‘focus’ the gathered light onto the focal plane, but also determines the field of view and the magnification of the scene as per an aesthetic decision. So, on 35mm ‘size’ acquisition, a 50mm lens focused on a subject at five feet gives the same subject size as a 100mm lens focused at ten feet or, indeed, a 25mm lens focused at two and a half feet: the subject size is constant but the magnification has halved or doubled, as has the field of view, which determines the angle of soft background information collected by the lens. The subsequent question concerns the implications for DoF between these similar shots, and the answer is that, at a given f-stop, the band of in-focus material is in fact very nearly constant, but the real-time perception is of different shots because of the quadrupling of magnification involved between the 25mm and 100mm lenses. The greater the magnification, the flatter the image appears, and so, paradoxically, shallow DoF may be more apparent on wider lenses focused close on their subjects because of the amount of background ambient information available to the spectator. This principle can be seen in the many variations of the ‘contra-zoom’, in which a zoom lens smoothly ramps the magnification whilst the physical movement of the camera on the dolly maintains the relative subject size.
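The claim that the band of focus stays put when subject size is held constant can be tested with the standard thin-lens depth of field approximation. The sketch below is illustrative only: the circle of confusion of 0.025mm and the aperture of f2.8 are values I have assumed for a 35mm-size frame, and the formula holds only for subjects closer than the hyperfocal distance.

```python
FEET_TO_MM = 304.8

def depth_of_field_mm(focal_mm, f_number, subject_mm, coc_mm=0.025):
    """Total depth of field (far limit minus near limit) from the thin-lens approximation.

    Valid only while the subject is closer than the hyperfocal distance.
    """
    hyperfocal = focal_mm ** 2 / (f_number * coc_mm) + focal_mm
    near = subject_mm * (hyperfocal - focal_mm) / (hyperfocal + subject_mm - 2 * focal_mm)
    far = subject_mm * (hyperfocal - focal_mm) / (hyperfocal - subject_mm)
    return far - near

# The three matched shots from the text: focal length in mm, subject distance in feet.
SHOTS = [(25, 2.5), (50, 5.0), (100, 10.0)]

if __name__ == "__main__":
    for focal, feet in SHOTS:
        dof = depth_of_field_mm(focal, 2.8, feet * FEET_TO_MM)
        print(f"{focal:3d}mm lens at {feet:4.1f} ft: depth of field {dof:.1f} mm")
```

Under these assumptions the three results agree to within a couple of per cent of one another, which supports the point that it is the change in background magnification, not the band of focus, that makes the shots feel different.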
Beyond the ‘taking’ capabilities of the lens is its capacity to transmit and control the intensity of light collected, using the mathematical f-stop measurement and its real world T-stop equivalent (the amount of light actually transmitted by a lens through to the focal plane). As previously mentioned, each step up the f-stop scale represents a halving of the amount of light per full graduation. In practice, this means that the concentric mechanical leaves of the iris constrict the light into a smaller circle of transmission, supplying the correct measured exposure to the sensor but also impacting on the DoF of the image. It is a truism to state that no image is ultimately sharp, but that it sits within the tolerances defined by the particular system in use: acceptably sharp, in effect. Each closing down of the iris by one stop removes half the light and deepens the depth of field apparent in the image (by a factor of roughly 1.4, so that two stops approximately double it), due to the constricting effect of the iris on the circles of confusion (the discs of light whose gradual narrowing produces the point of focus in the scene). The ultimate effect is to produce the classic ‘deep focus’ image, in a combination of small magnification and restricted iris, to maintain spectatorial acceptance of the scene played out in front of them.
The size and sensitivity of the target sensor determine the consequences for DoF as the image is captured. Sensor and film sensitivity is measured on a globally accepted scale, now commonly referred to as ISO (after the International Organization for Standardization, the overseeing body). In an ironclad relationship with the f-stop scale, ISO determines the platform of sensitivity on which the measurement of light intensity is performed. For a given ISO rating of 200 and an optimum exposure of f4, an increase in the ISO to 400 would require half the amount of light, so the lens could be closed down one stop to f5.6, deepening the perceived depth of field. It therefore follows that the more sensitive the acquisition medium, the easier it is to create the deep focus effect. The size of the sensor in most professional digital video cameras is 2/3”, smaller than a 16mm film aperture, and it paradoxically delivers images with too much depth of field for aesthetic sensibilities, with substantially un-cinematic visuals. This is attributable both to the relatively high sensitivity of the sensor (usually around ISO 320 native) and to the magnification necessary for a given subject size. If we apply the golden rule of cinematography, it is logical to conclude that for a given lens, distance and exposure on a 35mm system, the equivalent shot would be achieved on 16mm with a lens of half the focal length, but that this would also deliver double the depth of field, in essence halving the cinematic aesthetic by bestowing too much focus. For 2/3” chip digital cameras the easy maths collapses, but real world comparisons have established that the equivalent of a 50mm lens on 35mm acquisition would be about 18mm, with all the obvious connotations for DoF and creativity. As a footnote to cinematic DoF, the anamorphic process (which will perhaps find a new lease of life with the new full size sensor cameras) has ingrained a most recognisable aesthetic feature with cinema audiences.
Apart from the process requiring a doubling of magnification for a given shot size (and halving the DoF compared to standard 35mm), the non-spherical nature of the capture results in a unique visual artefact. The anamorphic lens bends light to an extreme degree on the horizontal plane to squeeze a greater panorama onto a squarer aperture, but it is only the focused light that can be ‘decoded’; the out-of-focus highlights are subject to the same process but, as they are not resolved, they are recorded as elongated ovals, a classic cinematic visual and an indelible clue as to the method of acquisition.
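To return briefly to the relationship between ISO and the f-stop scale set out in this discussion of depth of field, the trade between the two can be sketched in a few lines. The function names are my own, and the sketch assumes that the scene lighting and the exposure time per frame are held constant.

```python
import math

def stops_gained(iso_from, iso_to):
    """Change in sensitivity, in stops: each doubling of ISO is one stop."""
    return math.log2(iso_to / iso_from)

def equivalent_f_number(f_number, iso_from, iso_to):
    """f-number giving the same exposure after an ISO change, all else held constant."""
    return f_number * math.sqrt(iso_to / iso_from)

if __name__ == "__main__":
    gained = stops_gained(200, 400)
    new_stop = equivalent_f_number(4.0, 200, 400)
    print(f"ISO 200 to ISO 400 gains {gained:.0f} stop")
    print(f"f4 at ISO 200 matches f{new_stop:.2f} at ISO 400 (marked on the lens as f5.6)")
```

A more sensitive medium therefore lets the iris close down for the same exposure, which is why high sensitivity pushes a system towards deep focus rather than away from it.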
THE AUTEURS OF WORKFLOW.
The driving force behind any technological advancement is the willingness of filmmakers to adopt and adapt the new possibilities on offer as a creative impetus, and to explore beyond the boundaries of their peers. As evidenced by early adopters of digital technology such as Lars von Trier and Mike Figgis, it was not the similarity of the technology to film that was the attraction, but rather the ability to capture content for long periods of time. This, coupled with extreme portability and the lack of necessity for specialist lighting, removed the traditional temporal restriction of having to contain a scene within a ten-minute framework. The ‘prosumer’ DV cameras of the nineties could operate in low, naturally lit situations, recording up to ninety minutes of footage at a time and supplying the filmmaker with a broad creative canvas. The initial effect of this technology was significantly polarising for cinema, dividing the traditional conventions from the ‘non-directorial’ forays of Agnes Varda and the first-person horror of ‘The Blair Witch Project’ (Daniel Myrick, Eduardo Sanchez, 1999). The audience were aware and accepting of the limitations of resolution because the context of usage was outside the usual framework for cinematic imaging, and the unique visual signature was a foil to the experimental narrative structuring. The ‘immediacy’ of these images presented cinema with a new reality, but always in an investigational context, culminating in ‘Russian Ark’ (Sokurov, 2002), a ninety-six minute continuous historical journey through the Hermitage. One wonders whether Sokurov’s inspiration came from the narrative idea or from the fact that ninety-minute tapes existed. Hitchcock achieved the impression of a ‘one take’ film with ‘Rope’ (Hitchcock, 1948), the results only really marred by the necessity for projectionists to change reels, which jarred the invisible cuts.
What is beyond conjecture is that technical competency is a prerequisite for any filmmaker, but particularly for those individuals breaking new ground. The new auteur must comprehend their workflow before they can commit to their image capture. Digital technology supplies a unique creative environment, but the diversity of approaches requires the director or cinematographer to understand the specifics of data acquisition from lens to storage; otherwise that hardware can stifle the very creativity it supports.
As Hollywood embraces digital cinema, the convergence of hardware from the information-gathering sectors of the media strengthens, with blockbusters such as ‘The Curious Case of Benjamin Button’ (Fincher, 2008) and ‘Avatar’ (Cameron, 2009) shot with smaller-chip, documentary-style cameras. However, the most recent and telling trend (and a return to miniaturisation) is for DSLR cameras to support full high definition (approximately 2K) recording at progressive frame rates, thus providing full 35mm sensor acquisition (and in the case of the Canon 5D Mark II, a VistaVision-size sensor) in an incredibly small package. The irony is that this accessibility to the hardware of cinema does not necessarily guarantee a cinematic product; shallow depth of field requires craftsmanship if it is to be used in any narrative context. Godard’s ‘35mm camera in the glove box’ exists for everyone to use now, but we need to understand the century of technique behind film acquisition before we can use it as a cinematic tool.
BIBLIOGRAPHY
Austin, G. (2008) Contemporary French Cinema, 2nd Edition, Manchester and New York: Manchester University Press.
Belton, J. (2002) ‘Digital Cinema: A False Revolution’, in Braudy, L. and Cohen, M. (eds.) (2004) Film Theory and Criticism Sixth Edition New York and Oxford: Oxford University Press.
Everett, A. (2009) Digital Diaspora (SUNY Series Cultural Studies in Cinema/Video), New York: State University of New York Press
Fullerton, J. and Olsson, J. (eds.) (2004) Allegories of Communication: Intermedial Concerns from Cinema to the Digital (Stockholm Studies in Cinema), New Barnet: John Libbey and Co Ltd.
Hayward, S. (2006) Cinema Studies The Key Concepts Third Edition, London and New York: Routledge
Kipnis, L. (1998) ‘Film and Changing Technologies’ in Hill, J. and Church Gibson, P. (eds.) (1998) The Oxford Guide to Film Studies, New York and Oxford: Oxford University Press.
Le Grice, M. (2001) Experimental Cinema in the Digital Age (BFI Film Classics), London: BFI Publishing.
Marchessault, J. Lord, S. (eds) (2008) Fluid Screens, Expanded Cinema, Toronto Buffalo London: University of Toronto Press.
McKernan, B. (2005) Digital Cinema: The Revolution in Cinematography, Postproduction and Distribution, New York Chicago San Francisco Lisbon London Madrid Mexico City Milan New Delhi San Juan Seoul Singapore Sydney Toronto: McGraw-Hill.
Rodowick, D. N. (2007) The Virtual Life of Film, Cambridge, Massachusetts and London, England: Harvard University Press.
Rombes, N. (2009) Cinema in the Digital Age, London: Wallflower Press.
Stam, R. (2007) Film Theory An Introduction, Malden MA USA and Oxford and Carlton, Victoria: Blackwell Publishing.
Willis, H. (2008) New Digital Cinema: Reinventing the Moving Image, London: Wallflower Press.
FILMOGRAPHY
Blair Witch Project, The (Myrick, Sanchez, 1999)
Cloverfield (Reeves, 2008)
Gleaners and I, The (Varda, 2000)
Irreversible (Noe, 2002)
Public Enemies (Mann, 2009)
Russian Ark (Sokurov, 2002)
Slumdog Millionaire (Boyle, 2009)
Timecode (Figgis, 2000)
24 Hour Party People (Winterbottom, 2002)
28 Days Later (Boyle, 2002)
2012 (Emmerich, 2009)