Sunday, September 23, 2018

Data, Analytics and Algorithm as Fetish and the Semiotics of Fake Facts


 (Pix © Larry Catá Backer 2017)

Generally one ought to become worried--as one ought often to be worried in the United States--when intellectuals start using ordinary words as fetishes. One speaks of abstract fetishes here rather than either the traditional tangible fetish (an inanimate object revered or utilized for its connection with extra-human forces) or, increasingly in sex-obsessed America, forms of sexual desire gratified through objects or performance (usually to an extent judged abnormal by those charged with defining the "normal"). Fetishes, then, are objects with utility; they are objects invested with properties that belie their "ordinary use," and indeed their power comes from the extraordinary effect they might have when invoked during the course of certain rituals designed to "activate" them. Abstract fetishes, on the other hand, are activated by invocation. And in that invocation through utterance they can be quite powerful. Americans are fond of using certain terms for that effect: "liberal," "racist," "Islamophobe," "antisemite," among other even more powerful invocation words projected through utterance at a specific target, have increasingly been used in that way, and sometimes with extraordinary effect. What makes them powerful is their ability to combine data, analysis, and judgment within one word without the bother of exposing any of those elements to inspection or response.

Now, ironically, data, analytics, and especially the algorithm may themselves be acquiring fetish cult status in the U.S. (and elsewhere, of course). They have ceased to serve primarily to mean something in and of themselves. That original "something" might usefully be understood as the core of semiotic triadic relations between object, sign, and interpretant (for easy-to-read background, here). Instead they now serve as judgments which, when invoked, are expected to produce a calculated set of responses, both from those against whom they are used and from those who are connected to them (employers, customers, friends, family, the state, etc.). One sees this in a number of distinct contexts around either the value or the threat posed by "big data," data analytics, and especially "algorithms." These are understood as "things" with specific powers that can be invoked by those with the technology and vision to use them (some discussion of sources here).

This post attempts a very simple, and perhaps simpleminded, dissection of the fetish of data, of analytics, and of algorithms. The context is facial recognition (which has also assumed a fetish status in its own right). The object is to start, very tentatively, to consider the difficulties of extracting the meaning of each of those terms in a context in which law, policy, and business increasingly rely on the three in their regulatory structures, compliance, risk management, social control, and policy judgments. In the process one can more clearly see how carelessness--strategic or otherwise--can turn these fetishes into quite powerful tokens of misdirection, if not misinformation. I will take them one at a time.


1. The relationship between data, analytics and algorithm. Data are basic bits of information. Notice I did not say "facts" (more on that in a bit). Analytics use data to produce a judgment of some kind. Notice I did not say "conclusion" (more on that as well below). Lastly, algorithms are the means through which the judgments of analytics produce consequences, either direct or indirect. Thus data, standing alone, have "nothing to say" other than "you have singled me out." Analytics requires data but invests it with premises and structures relationships. And algorithms are the means by which one can act on analytics to whatever end one has the will or imagination (or technology) to make happen.
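To make the division of labor concrete, here is a minimal Python sketch--entirely hypothetical, with invented bits, an invented premise, and an invented consequence--whose only point is to show where each of the three terms does its work:

```python
# A hypothetical sketch of the data -> analytics -> algorithm division of labor.
# Nothing here is a real system; names and thresholds are illustrative only.

# DATA: bits of information, singled out but saying nothing by themselves.
data = {"person_id": 42, "bits": [0.12, 0.87, 0.33]}

# ANALYTICS: invests the bits with premises and produces a judgment.
def analytics(bits, premise_threshold=0.5):
    # Premise (a choice, not a fact): bits above the threshold "count",
    # and two or more counting bits warrant a judgment.
    return sum(1 for b in bits if b > premise_threshold) >= 2

# ALGORITHM: attaches a consequence to the analytic judgment.
def algorithm(judgment):
    return "flag for review" if judgment else "no action"

judgment = analytics(data["bits"])
print(algorithm(judgment))  # the consequence, not the "fact"
```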

2. In the context of facial recognition, data might be the bits of information which, in the aggregate, can be used to distinguish one "face" from another. That entails some premises (e.g., what constitutes a face? what portion of the face is necessary for building distinctive constructs that correspond to the individuals from whom these bits of data are extracted? how are such data to be extracted in the presence of glasses, contact lenses, make-up, hair dye, haircuts, facial jewelry, etc.?). Analytics are the relationships among the bits of information that can be deemed to constitute an autonomous face. It includes certain premises (e.g., that faces are different, that they may be distinguishable from one another, that the differences do not repeat, etc.). Note that analytics looks backwards toward data and also forward to algorithm. It looks backward in the sense that the premises of the analytics infuse the determination of the bits of information that must be harvested and of those bits deemed insignificant for the purpose of analytics. Analytics looks forward in the sense that the focus of analytics must be shaped to maximize its value to the efficient application of algorithm (e.g., in the case of facial recognition: note presence, bar from purchase, bar from entering, bar from using, reduce cost, etc.).
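A toy sketch may help. Real facial recognition systems work with learned embeddings of hundreds of dimensions; the three-number vectors, the Euclidean distance, and the threshold below are invented stand-ins, chosen only to show that "same face" is a judgment produced by premises rather than a fact residing in the bits:

```python
import math

# Hypothetical sketch: a "face" reduced to a short feature vector.
face_a = [0.21, 0.74, 0.05]   # bits harvested from one image
face_b = [0.23, 0.71, 0.06]   # bits harvested from another image

def distance(u, v):
    # One analytic premise among many: Euclidean distance measures "sameness".
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))

SAME_FACE_THRESHOLD = 0.1  # a chosen premise, not a natural constant

def same_face(u, v):
    return distance(u, v) < SAME_FACE_THRESHOLD

print(same_face(face_a, face_b))  # True -- under these premises
```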

3. Big data and AI add little to this except a fear factor and the management of human discretion in data, analytics and algorithm. It follows that "big data" is nothing more than lots of data (how much is enough to qualify is itself a subject of analytics). And artificial intelligence, or machine learning, or whatever other marketing-enhancing term is chosen, refers to nothing more than the capacity of machines to engage in analytics or (with perhaps a bit more trepidation) to apply and create algorithms without direct human intervention. Of course, with respect to the latter, the notion of human passivity is itself an active choice. People remain passive in the face of machine learning and A.I. because they choose to, because it serves their purposes, because it advances interests. To that extent one ought to fear the human in A.I. rather than the operating systems that it represents. In the context of facial recognition, big data references both the quantity of bits of information necessary to make big data analytics possible (the amount of bits necessary to construct autonomous and unique "faces" attributable to unique individuals) and the capacity to process that information. It also references the analytic capabilities necessary to translate bits into useful judgment (e.g., a face tied to an identity/individual). Predictive analytics, a term sometimes used (because it is both useful and markets itself for its objective), connects one to A.I. in the sense that this term (including a wide variation around machine learning or machine intelligence) can be understood as imbuing analytics with the capacity to make choices (apply consequences) based on analytics without human intervention: face recognition, plus information about the individual represented by the "face" (the consequential normative element derived from the application of other analytics to other streams of bits of information, e.g., association with certain undesirable people, purchase or debt payment histories), produces a judgment--prohibit purchase of airplane tickets, rejection of loans, arrest warrant issued, etc.
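The "machine" part can be rendered in miniature. The gallery, vectors, and names below are invented; the point is only that identification automates the analytic judgment while leaving every human choice--which features, which gallery, which consequence--firmly in place:

```python
# Hypothetical sketch of "machine" identification: nearest neighbor over a
# gallery of labeled face vectors. The machine only automates analytics;
# humans chose the gallery, the features, and what follows from a match.
gallery = {
    "alice": [0.21, 0.74, 0.05],
    "bob":   [0.90, 0.10, 0.40],
}

def identify(probe):
    # Squared distance: a chosen measure of "sameness", not a natural one.
    d = lambda u, v: sum((x - y) ** 2 for x, y in zip(u, v))
    return min(gallery, key=lambda name: d(gallery[name], probe))

# "Prediction" is just the automated judgment, handed on to an algorithm.
print(identify([0.22, 0.70, 0.07]))  # -> 'alice' under these premises
```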

4. Data. Data is the ultimate semiotic. It is everything and nothing. It is the presence or absence of things. It does not exist in itself but is singled out for the purpose of signification (that is, of serving as the bits necessary for specific analytics). Data are not "facts"--whatever one might argue "facts" might be. Rather they include anything that serves as a basis of signification--traditional facts (e.g., grass in a specific place) or opinion (e.g., people who are willing to say publicly that they admire the president in year X). The first problem arises when data is constructed to serve as its own analytics (counting membership in racial groups by defining race with reference to a "one drop" principle). The latter becomes important when data serves a political purpose (e.g., the U.S. census). The second problem occurs in choosing data for signification. Data as object cannot be available for signification until it is identified. But the process of identification is precisely the place where the effort toward a specific signification causes a strategic misuse of data identification. The result is the construction of a data fetish. But people and communities do this all the time. Take a simple example: employment. Signification will be different depending on how I identify its object: everyone who received a paycheck for at least a week versus everyone who has worked a full-time job for at least 4 weeks. The variations are endless. Or consider faculty productivity. If what is measured is the size of grants, then two things happen. The first is that the signification changes the character of the object--only faculty with grants are productive. The second is that it changes the consequences of algorithm--only productive faculty get rewarded. Yet the entire construct is built either on a falsehood (only faculty with grants are productive) or it seeks to back-end a specific analytics by tailoring the object to fit the needs of a targeted signification. What this suggests is that while everything can be objectified as data--as a bit of information--there is nothing inherently rational or inevitable about the character of a bit of information that exists outside of the ideological, political or practical needs that bring it to attention. Note it is not suggested that information bits are brought into existence--the focus is on attention. Bits of information exist as data only when chosen for that purpose by an interpretant in need of an object onto which to convey signification.
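The employment example can be run as arithmetic. The records below are invented for illustration; the same population yields two different "employment rates" depending only on which bits are singled out as data:

```python
# Hypothetical illustration: the "employment rate" depends on which bits are
# singled out as data. All records below are invented.
people = [
    {"paid_weeks": 1, "full_time": False},
    {"paid_weeks": 6, "full_time": True},
    {"paid_weeks": 0, "full_time": False},
    {"paid_weeks": 5, "full_time": False},  # part-time, paid for 5 weeks
]

# Definition A: anyone paid for at least one week counts as employed.
employed_a = sum(1 for p in people if p["paid_weeks"] >= 1)

# Definition B: only full-time work of at least 4 weeks counts.
employed_b = sum(1 for p in people if p["full_time"] and p["paid_weeks"] >= 4)

print(employed_a / len(people))  # 0.75 under definition A
print(employed_b / len(people))  # 0.25 under definition B
```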

5. Analytics. Analytics correspond to the sign in semiotics; analytics is the means through which objects acquire signification. Before analytics, objects merely "are." After signification, "objects" are judged--that is, they are imbued with meaning. But analytics themselves are the creatures of the ideologies within which analysis itself is given signification. Americans analyze race and ethnicity, for example. But that analysis acquires its own signification as it applies the principles and premises that underlie the decision first to choose specific objects in specific form (the politics of data) and then to apply a specific catalogue of metrics (the politics of judging). These analytics must conform to social and political expectation--as well as to social judgments. Drinking is now understood as a vice--data and analytics are deployed to suggest there is something intrinsic in this cultural judgment, especially as it touches on drinking by youth and its effects on the management of relations among the sexually active. This is not a criticism as much as it is a necessary recognition that there is nothing either natural or inevitable about analytics (itself already precariously built on strategic objectification). And there is irony--"getting the analysis right" now acquires a new and sadly perverse meaning. That is not because there is anything wrong with this sort of contingent and strategic analytics--rather it is because people work furiously hard to pretend that it does not exist--and they draw on strategic analytics (piling irony higher) to "prove" it. Because the core function of analytics is judgment, it is easy enough to forget that those judgments are contingent. Facial recognition is fairly straightforward. In its analytics the connection between object and signification is direct. One collects bits of information to be able to distinguish one cluster of information as a "face" unique from other clusters of information that constitute another unique "face." The signification "face" is important, but not inherently so. It acquires importance (utility or exogenous effect) only when it is taken up by an interpretant to some effect--that is, when analytics supplies algorithm.
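The contingency of the judgment can be shown in a few lines. The vectors and thresholds below are invented; the same bits yield opposite verdicts as the analytic premise shifts:

```python
import math

# Hypothetical sketch: identical bits, opposite judgments, as the
# analytic premise (the threshold) is moved. All values are invented.
face_x = [0.21, 0.74, 0.05]
face_y = [0.27, 0.68, 0.02]

d = math.sqrt(sum((a - b) ** 2 for a, b in zip(face_x, face_y)))  # = 0.09

for threshold in (0.05, 0.10, 0.20):
    verdict = "same face" if d < threshold else "different faces"
    print(f"threshold={threshold}: {verdict}")
# threshold=0.05: different faces
# threshold=0.1:  same face
# threshold=0.2:  same face
```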

6. Algorithm. The algorithm embraces the role of the semiotic interpretant. Americans are being taught to fear the algorithm the way their ancestors were taught to fear the creatures that inhabited forests at night. Yet algorithms are merely more systematic versions of societal action that is as old as community--the judging of others and their discipline for the purpose of furthering the interests of those in control of judging and discipline. But of course, people have been taught that there is something mystical and essential in numbers that principles and moral orders lack. Yet people forget that the language of numbers is nothing more than that--another language that may be invoked for all sorts of aims. Just as in language all is dependent on the shared meaning of words and their contextual deployment, so in quantitative languages all depends on assumptions and the choices made for aligning relationships and consequences. Not, of course, to understand naturally occurring phenomena, but rather to invest the judgment of analytics with consequences. And that is the point--algorithms are merely a systematic way to organize principles for attaching consequences to the product of analytics. It is moral philosophy made manifest through the practice of interpretation, which now acquires a more comprehensive meaning. To that end, of course, facial recognition is a pathway to a greater objective that can be realized only through algorithm. That is, neither the information bits nor their analytics provide anything interesting other than a confirmation of a premise that with sufficient data it is possible to distinguish faces as the marker of individuals. Here is where algorithm becomes the scary fetish into which it has been converted. But fetish here is metaphor. The algorithm provides a metric merely for efficiently imposing moral, political or social judgment (and consequences) based on the categorization made possible by analytics. But that is possible only where algorithm combines multiple analytics to produce a consequence through an analysis of the intersection of analytics. Algorithm in the facial recognition context provides a good example. Facial recognition at its most basic can provide information about location. That may be very important, but it has limited value beyond that. However, it acquires more value when it can serve to flag presence in the context of other action. Face recognition analytics become useful only when combined with other analytics connected to the individual with a unique "face." It is useful especially for the distribution of consequences at "crossroads." Some of these "crossroads" are state based: customs and immigration inspection, traffic stops, applications for transportation, or admission to institutions (universities, hospitals, etc.), or applications for some other benefit from the state. Many of these crossroads are private: purchases of goods and services, applications for or maintenance of employment, maintenance of social relations, applications for loans or other benefits. It is in these combinations that algorithms become valuable (to their users) and threatening (to their objects). It is not so much where one is, but the combination of where one is with what one can do, that makes these powerful agents of social control.
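A hypothetical "crossroads" can be sketched as follows. Every profile and rule here is invented; the point is that the face analytic alone answers only "who is here," while the algorithm gains its power by combining it with other analytic streams to decide "what they may do":

```python
# Hypothetical "crossroads" sketch. All records and rules are invented;
# the face analytic supplies only identity, and the algorithm combines
# it with other analytic streams to dispense consequences.
profiles = {
    "alice": {"debt_flag": False, "watchlist": False},
    "bob":   {"debt_flag": True,  "watchlist": False},
}

def crossroads(identity, purpose):
    # Unknown faces default to the most suspicious profile--itself a
    # policy choice embedded in the algorithm, not a fact.
    p = profiles.get(identity, {"debt_flag": True, "watchlist": True})
    if purpose == "board_flight" and p["watchlist"]:
        return "deny boarding"
    if purpose == "take_loan" and p["debt_flag"]:
        return "reject application"
    return "permit"

print(crossroads("alice", "take_loan"))  # permit
print(crossroads("bob", "take_loan"))    # reject application
```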

7. Fakery. When understood in its semiotic dimension, the relationship among data, analytics and algorithm reveals the contours of fakery. Objects come into notice only if they are signified; and signification is experienced both in each act of interpretation (a specific interpretation) and, more importantly, ultimately gives meaning to the object signified when the Sign is sufficiently considered (Peirce's notions of dynamic and normal interpretants). Yet none of this occurs in a vacuum. More importantly, none of this occurs in the absence of a system of meaning plus a system of evaluation of meaning, which includes within it keys to assessment. As a consequence, interpretation is possible only within the confines of the framing premises of the ideology within which it operates. The fakery arises not because the assessments are wrong or because there is a flaw in object selection or analytics--the fakery is built into notions that such choices, analytics and assessments are somehow absolute and fixed beyond the power of agendas or ideologies. That misdirection arises from the "natural" conflation of the non-contingent nature of quantitative symbols (e.g., the number "2" has a fixed meaning by consensus that is difficult to dispute) with the data analytics that produces fixed relationships among information bits, producing a specific result (like the sum of "2" plus "2") that leads to a fixed consequence (the number "4"). Yet that misses the significant gulf between the fixity of a quantitative symbol ("2") and the contingency of the cluster of assumptions and choices that produce the relationships between data, analytics and algorithm (described above). In the context of facial recognition, the issues of fixity and fakery tend to be minimized (without considering subversion techniques--masks, hoods, wigs, and other disguises). Here it may be second-order analytics that mask contingency in quantitative analytics and policy judgments in algorithm. These might include the assumptions about the consequences of movement to particular sites (for example the analytics that produce a presumption that travel to particular areas of a city produces a high likelihood of efforts to purchase or use interdicted drugs). Profiling is a primitive but effective example of a way in which data driven analytics contribute to an algorithm that results in a presumption that certain "inherent" characteristics (race, religion, sex, or attributes culturally linked to any of these) will make it more likely that people with those characteristics in certain contexts will act in certain ways. These presumptions are built on analytics that appear fixed (based on "facts"), but those facts are the product of choices among information bits that might well have made a specific analytic result more likely.
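The gulf can be made visible in miniature. In the invented records below, the division that computes a "search rate" is as fixed as 2 plus 2; what is contingent is the choice of which information bits feed it:

```python
# Hypothetical sketch of the gulf the paragraph describes: the arithmetic
# is fixed, but the "facts" it operates on are chosen. All records are
# invented for illustration.
stops = [
    {"district": "north", "search": True},
    {"district": "north", "search": True},
    {"district": "south", "search": False},
    {"district": "south", "search": True},
    {"district": "south", "search": False},
]

def search_rate(records):
    # The division itself is as non-contingent as 2 + 2 = 4 ...
    return sum(r["search"] for r in records) / len(records)

# ... but which information bits feed it is a choice.
print(search_rate(stops))                                           # 0.6
print(search_rate([r for r in stops if r["district"] == "north"]))  # 1.0
```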

8. Fixity. One can see, then, how the cultural characteristics of the fixity of quantitative objects could migrate onto an analytical environment in which quantitative analysis is grounded in ideologically and culturally contingent choices among the objects of analysis, the forms of that analysis, and the consequences to be imposed. Data analytics, and the algorithms that derive from them, are then as much political as qualitative (word driven) analytics producing a logical sequence of meanings meant to persuade people to take a particular position with respect to an issue. Likewise, big data analytics adds quantity but not dimension to the analysis of the fixed nature of what it considers. And lastly, AI that builds algorithm merely substitutes for and amplifies the ideological baselines inherent in the premises built into the judgment that is, in fact, the algorithm. Yet those ought to be as debatable as if words had been used to arrive at the same judgment. Until that happens, it will be easy to stifle debate by hiding ideology in premises and assumptions, and by constructing something that looks fixed on the basis of deliberate choices to include and exclude the information bits that feed into an analytics that in turn feeds a mechanism for dispensing consequences--like regulatory systems, but without the bother of words, and by substituting the tyrannies of choices built into the algorithmic assumptions for those of administrative discretion.
