In social media, a story is a function in which the user tells a narrative or provides status messages and information in the form of short, time-limited clips in an automatically running sequence. == Definition == A story is a short sequence of images, videos, or other social media content, which can be accompanied by backgrounds, music, text, stickers, animations, filters or emojis. Social media platforms typically advance through the sequence automatically when presenting a story to a viewer. Although the sequential nature of stories can be used to tell a narrative, the pieces of a story can also be unrelated. Social media platforms that offer stories will typically have a primary story for each user which consists of everything the user posted to their story over a certain period of time, usually the most recent 24 hours. Most stories cannot be changed afterwards and are only available for a short time. Stories are almost exclusively created on a mobile device such as a smartphone or tablet computer and are usually displayed vertically. == History == In October 2013, Snapchat first introduced the story function as a series of Snaps that can together tell a narrative through a chronological order, with each Snap being viewable by all of the poster's friends and deleted after 24 hours. Stories soon surpassed private Snaps to become Snapchat's most-viewed type of post. After 2015, Snapchat introduced a feature allowing users to post private stories viewable by a chosen subset of their friends. Later other apps would copy this feature. In August 2016, Instagram introduced a stories function that deletes the content after 24 hours. Various commenters have accused the site of copying Snapchat. In February 2017, the instant messenger WhatsApp introduced the Now Status stories function in beta, which was later renamed Status. In March 2017, a story function was introduced in Facebook Messenger. In February 2018, Google launched AMP Stories, bringing a story-style format to certain Google search results on mobile devices. In August 2018, YouTube introduced a stories function that initially was limited to pictures, but was later expanded to support short video clips. The feature was shut down in June 2023. In August 2018, the GIF website Giphy introduced a story function. In March 2022, TikTok added a story feature which allowed users to create 15 second long videos that delete after 24 hours. In June 2023, Telegram CEO Pavel Durov announced stories for Telegram would be released in July 2023. In July 2023, the feature was released for premium users, and in August 2023 it was rolled out for all users. == User motivations == In 2022, a study performed by Jia-Dai (Evelyn) Lu and Jhih-Syuan (Elaine) Lin examined the various motivations for updating stories on Instagram. The researchers found a new configuration of motivations for using Instagram Stories: exploration, self-enhancement, perceived functionality, entertainment, social sharing, relationship building, novelty, and surveillance. The findings also highlighted that contribution and creation activities are likely to result in positive emotions, while creation alone predicts negative emotions while updating stories on Instagram. == Usage statistics == In 2019, around 1.5 billion people worldwide every day on average used the stories function in a social network or messenger. Younger people in particular use this function. More than 20% of people aged 18 to 24 use Instagram stories, while it is just under 2% of those over 55. In a Facebook survey of 18,000 participants from 12 countries, 68% said they used the stories function at least once a month. Stories in the areas of fashion and tourism are particularly popular. The website Fanpage Karma analyzed several Instagram accounts and determined the average reach of posts and stories per follower, concluding that posts have a higher reach than stories, which often have less than half the reach.
Manifold hypothesis
The manifold hypothesis posits that many high-dimensional data sets that occur in the real world actually lie along low-dimensional latent manifolds inside that high-dimensional space. As a consequence of the manifold hypothesis, many data sets that appear to initially require many variables to describe, can actually be described by a comparatively small number of variables, linked to the local coordinate system of the underlying manifold. It is suggested that this principle underpins the effectiveness of machine learning algorithms in describing high-dimensional data sets by considering a few common features. The manifold hypothesis is related to the effectiveness of nonlinear dimensionality reduction techniques in machine learning. Many techniques of dimensional reduction make the assumption that data lies along a low-dimensional submanifold, such as manifold sculpting, manifold alignment, and manifold regularization. The major implications of this hypothesis is that Machine learning models only have to fit relatively simple, low-dimensional, highly structured subspaces within their potential input space (latent manifolds). Within one of these manifolds, it's always possible to interpolate between two inputs, that is to say, morph one into another via a continuous path along which all points fall on the manifold. The ability to interpolate between samples is the key to generalization in deep learning. == The information geometry of statistical manifolds == An empirically-motivated approach to the manifold hypothesis focuses on its correspondence with an effective theory for manifold learning under the assumption that robust machine learning requires encoding the dataset of interest using methods for data compression. This perspective gradually emerged using the tools of information geometry thanks to the coordinated effort of scientists working on the efficient coding hypothesis, predictive coding and variational Bayesian methods. The argument for reasoning about the information geometry on the latent space of distributions rests upon the existence and uniqueness of the Fisher information metric. In this general setting, we are trying to find a stochastic embedding of a statistical manifold. From the perspective of dynamical systems, in the big data regime this manifold generally exhibits certain properties such as homeostasis: We can sample large amounts of data from the underlying generative process. Machine Learning experiments are reproducible, so the statistics of the generating process exhibit stationarity. In a sense made precise by theoretical neuroscientists working on the free energy principle, the statistical manifold in question possesses a Markov blanket.
Content-based image retrieval
Content-based image retrieval, also known as query by image content (QBIC) and content-based visual information retrieval (CBVIR), is the application of computer vision techniques to the image retrieval problem, that is, the problem of searching for digital images in large databases (see this survey for a scientific overview of the CBIR field). Content-based image retrieval is opposed to traditional concept-based approaches (see Concept-based image indexing). "Content-based" means that the search analyzes the contents of the image rather than the metadata such as keywords, tags, or descriptions associated with the image. The term "content" in this context might refer to colors, shapes, textures, or any other information that can be derived from the image itself. CBIR is desirable because searches that rely purely on metadata are dependent on annotation quality and completeness. == Comparison with metadata searching == An image meta search requires humans to have manually annotated images by entering keywords or metadata in a large database, which can be time-consuming and may not capture the keywords desired to describe the image. The evaluation of the effectiveness of keyword image search is subjective and has not been well-defined. In the same regard, CBIR systems have similar challenges in defining success. "Keywords also limit the scope of queries to the set of predetermined criteria." and, "having been set up" are less reliable than using the content itself. == History == The term "content-based image retrieval" seems to have originated in 1992 when it was used by Japanese Electrotechnical Laboratory engineer Toshikazu Kato to describe experiments into automatic retrieval of images from a database, based on the colors and shapes present. Since then, the term has been used to describe the process of retrieving desired images from a large collection on the basis of syntactical image features. The techniques, tools, and algorithms that are used originate from fields such as statistics, pattern recognition, signal processing, and computer vision. === QBIC - Query By Image Content === The earliest commercial CBIR system was developed by IBM and was called QBIC (Query By Image Content). Recent network- and graph-based approaches have presented a simple and attractive alternative to existing methods. While the storing of multiple images as part of a single entity preceded the term BLOB (Binary Large OBject), the ability to fully search by content, rather than by description, had to await IBM's QBIC. === VisualRank === == Technical progress == The interest in CBIR has grown because of the limitations inherent in metadata-based systems, as well as the large range of possible uses for efficient image retrieval. Textual information about images can be easily searched using existing technology, but this requires humans to manually describe each image in the database. This can be impractical for very large databases or for images that are generated automatically, e.g. those from surveillance cameras. It is also possible to miss images that use different synonyms in their descriptions. Systems based on categorizing images in semantic classes like "cat" as a subclass of "animal" can avoid the miscategorization problem, but will require more effort by a user to find images that might be "cats", but are only classified as an "animal". Many standards have been developed to categorize images, but all still face scaling and miscategorization issues. Initial CBIR systems were developed to search databases based on image color, texture, and shape properties. After these systems were developed, the need for user-friendly interfaces became apparent. Therefore, efforts in the CBIR field started to include human-centered design that tried to meet the needs of the user performing the search. This typically means inclusion of: query methods that may allow descriptive semantics, queries that may involve user feedback, systems that may include machine learning, and systems that may understand user satisfaction levels. == Techniques == Many CBIR systems have been developed, but as of 2006, the problem of retrieving images on the basis of their pixel content remains largely unsolved. Different query techniques and implementations of CBIR make use of different types of user queries. === Query By Example === QBE (Query By Example) is a query technique that involves providing the CBIR system with an example image that it will then base its search upon. The underlying search algorithms may vary depending on the application, but result images should all share common elements with the provided example. Options for providing example images to the system include: A preexisting image may be supplied by the user or chosen from a random set. The user draws a rough approximation of the image they are looking for, for example with blobs of color or general shapes. This query technique removes the difficulties that can arise when trying to describe images with words. === Semantic retrieval === Semantic retrieval starts with a user making a request like "find pictures of Abraham Lincoln". This type of open-ended task is very difficult for computers to perform - Lincoln may not always be facing the camera or in the same pose. Many CBIR systems therefore generally make use of lower-level features like texture, color, and shape. These features are either used in combination with interfaces that allow easier input of the criteria or with databases that have already been trained to match features (such as faces, fingerprints, or shape matching). However, in general, image retrieval requires human feedback in order to identify higher-level concepts. === Relevance feedback (human interaction) === Combining CBIR search techniques available with the wide range of potential users and their intent can be a difficult task. An aspect of making CBIR successful relies entirely on the ability to understand the user intent. CBIR systems can make use of relevance feedback, where the user progressively refines the search results by marking images in the results as "relevant", "not relevant", or "neutral" to the search query, then repeating the search with the new information. Examples of this type of interface have been developed. === Iterative/machine learning === Machine learning and application of iterative techniques are becoming more common in CBIR. === Other query methods === Other query methods include browsing for example images, navigating customized/hierarchical categories, querying by image region (rather than the entire image), querying by multiple example images, querying by visual sketch, querying by direct specification of image features, and multimodal queries (e.g. combining touch, voice, etc.) == Content comparison using image distance measures == The most common method for comparing two images in content-based image retrieval (typically an example image and an image from the database) is using an image distance measure. An image distance measure compares the similarity of two images in various dimensions such as color, texture, shape, and others. For example, a distance of 0 signifies an exact match with the query, with respect to the dimensions that were considered. As one may intuitively gather, a value greater than 0 indicates various degrees of similarities between the images. Search results then can be sorted based on their distance to the queried image. Many measures of image distance (Similarity Models) have been developed. === Color === Computing distance measures based on color similarity is achieved by computing a color histogram for each image that identifies the proportion of pixels within an image holding specific values. Examining images based on the colors they contain is one of the most widely used techniques because it can be completed without regard to image size or orientation. However, research has also attempted to segment color proportion by region and by spatial relationship among several color regions. === Texture === Texture measures look for visual patterns in images and how they are spatially defined. Textures are represented by texels which are then placed into a number of sets, depending on how many textures are detected in the image. These sets not only define the texture, but also where in the image the texture is located. Texture is a difficult concept to represent. The identification of specific textures in an image is achieved primarily by modeling texture as a two-dimensional gray level variation. The relative brightness of pairs of pixels is computed such that degree of contrast, regularity, coarseness and directionality may be estimated. The problem is in identifying patterns of co-pixel variation and associating them with particular classes of textures such as silky, or rough. Other methods of classifying textures include: Co-occurrence matrix Laws texture energy Wavelet transform Orthogonal transforms (discrete Chebyshev moments) =
Thompson sampling
Thompson sampling, named after William R. Thompson, is a heuristic for choosing actions that address the exploration–exploitation dilemma in the multi-armed bandit problem. It consists of choosing the action that maximizes the expected reward with respect to a randomly drawn belief. == Description == Consider a set of contexts X {\displaystyle {\mathcal {X}}} , a set of actions A {\displaystyle {\mathcal {A}}} , and rewards in R {\displaystyle \mathbb {R} } . The aim of the player is to play actions under the various contexts, such as to maximize the cumulative rewards. Specifically, in each round, the player obtains a context x ∈ X {\displaystyle x\in {\mathcal {X}}} , plays an action a ∈ A {\displaystyle a\in {\mathcal {A}}} and receives a reward r ∈ R {\displaystyle r\in \mathbb {R} } following a distribution that depends on the context and the issued action. The elements of Thompson sampling are as follows: a likelihood function P ( r | θ , a , x ) {\displaystyle P(r|\theta ,a,x)} ; a set Θ {\displaystyle \Theta } of parameters θ {\displaystyle \theta } of the distribution of r {\displaystyle r} ; a prior distribution P ( θ ) {\displaystyle P(\theta )} on these parameters; past observations triplets D = { ( x ; a ; r ) } {\displaystyle {\mathcal {D}}=\{(x;a;r)\}} ; a posterior distribution P ( θ | D ) ∝ P ( D | θ ) P ( θ ) {\displaystyle P(\theta |{\mathcal {D}})\propto P({\mathcal {D}}|\theta )P(\theta )} , where P ( D | θ ) {\displaystyle P({\mathcal {D}}|\theta )} is the likelihood function. Thompson sampling consists of playing the action a ∗ ∈ A {\displaystyle a^{\ast }\in {\mathcal {A}}} according to the probability that it maximizes the expected reward; action a ∗ {\displaystyle a^{\ast }} is chosen with probability ∫ I [ E ( r | a ∗ , x , θ ) = max a ′ E ( r | a ′ , x , θ ) ] P ( θ | D ) d θ , {\displaystyle \int \mathbb {I} \left[\mathbb {E} (r|a^{\ast },x,\theta )=\max _{a'}\mathbb {E} (r|a',x,\theta )\right]P(\theta |{\mathcal {D}})d\theta ,} where I {\displaystyle \mathbb {I} } is the indicator function. In practice, the rule is implemented by sampling. In each round, parameters θ ∗ {\displaystyle \theta ^{\ast }} are sampled from the posterior P ( θ | D ) {\displaystyle P(\theta |{\mathcal {D}})} , and an action a ∗ {\displaystyle a^{\ast }} chosen that maximizes E [ r | θ ∗ , a ∗ , x ] {\displaystyle \mathbb {E} [r|\theta ^{\ast },a^{\ast },x]} , i.e. the expected reward given the sampled parameters, the action, and the current context. Conceptually, this means that the player instantiates their beliefs randomly in each round according to the posterior distribution, and then acts optimally according to them. In most practical applications, it is computationally onerous to maintain and sample from a posterior distribution over models. As such, Thompson sampling is often used in conjunction with approximate sampling techniques. == History == Thompson sampling was originally described by Thompson in 1933. It was subsequently rediscovered numerous times independently in the context of multi-armed bandit problems. A first proof of convergence for the bandit case has been shown in 1997. The first application to Markov decision processes was in 2000. A related approach (see Bayesian control rule) was published in 2010. In 2010 it was also shown that Thompson sampling is instantaneously self-correcting. Asymptotic convergence results for contextual bandits were published in 2011. Thompson Sampling has been widely used in many online learning problems including A/B testing in website design and online advertising, and accelerated learning in decentralized decision making. A Double Thompson Sampling (D-TS) algorithm has been proposed for dueling bandits, a variant of traditional MAB, where feedback comes in the form of pairwise comparison. == Relationship to other approaches == === Probability matching === Probability matching is a decision strategy in which predictions of class membership are proportional to the class base rates. Thus, if in the training set positive examples are observed 60% of the time, and negative examples are observed 40% of the time, the observer using a probability-matching strategy will predict (for unlabeled examples) a class label of "positive" on 60% of instances, and a class label of "negative" on 40% of instances. === Bayesian control rule === A generalization of Thompson sampling to arbitrary dynamical environments and causal structures, known as Bayesian control rule, has been shown to be the optimal solution to the adaptive coding problem with actions and observations. In this formulation, an agent is conceptualized as a mixture over a set of behaviours. As the agent interacts with its environment, it learns the causal properties and adopts the behaviour that minimizes the relative entropy to the behaviour with the best prediction of the environment's behaviour. If these behaviours have been chosen according to the maximum expected utility principle, then the asymptotic behaviour of the Bayesian control rule matches the asymptotic behaviour of the perfectly rational agent. The setup is as follows. Let a 1 , a 2 , … , a T {\displaystyle a_{1},a_{2},\ldots ,a_{T}} be the actions issued by an agent up to time T {\displaystyle T} , and let o 1 , o 2 , … , o T {\displaystyle o_{1},o_{2},\ldots ,o_{T}} be the observations gathered by the agent up to time T {\displaystyle T} . Then, the agent issues the action a T + 1 {\displaystyle a_{T+1}} with probability: P ( a T + 1 | a ^ 1 : T , o 1 : T ) , {\displaystyle P(a_{T+1}|{\hat {a}}_{1:T},o_{1:T}),} where the "hat"-notation a ^ t {\displaystyle {\hat {a}}_{t}} denotes the fact that a t {\displaystyle a_{t}} is a causal intervention (see Causality), and not an ordinary observation. If the agent holds beliefs θ ∈ Θ {\displaystyle \theta \in \Theta } over its behaviors, then the Bayesian control rule becomes P ( a T + 1 | a ^ 1 : T , o 1 : T ) = ∫ Θ P ( a T + 1 | θ , a ^ 1 : T , o 1 : T ) P ( θ | a ^ 1 : T , o 1 : T ) d θ {\displaystyle P(a_{T+1}|{\hat {a}}_{1:T},o_{1:T})=\int _{\Theta }P(a_{T+1}|\theta ,{\hat {a}}_{1:T},o_{1:T})P(\theta |{\hat {a}}_{1:T},o_{1:T})\,d\theta } , where P ( θ | a ^ 1 : T , o 1 : T ) {\displaystyle P(\theta |{\hat {a}}_{1:T},o_{1:T})} is the posterior distribution over the parameter θ {\displaystyle \theta } given actions a 1 : T {\displaystyle a_{1:T}} and observations o 1 : T {\displaystyle o_{1:T}} . In practice, the Bayesian control amounts to sampling, at each time step, a parameter θ ∗ {\displaystyle \theta ^{\ast }} from the posterior distribution P ( θ | a ^ 1 : T , o 1 : T ) {\displaystyle P(\theta |{\hat {a}}_{1:T},o_{1:T})} , where the posterior distribution is computed using Bayes' rule by only considering the (causal) likelihoods of the observations o 1 , o 2 , … , o T {\displaystyle o_{1},o_{2},\ldots ,o_{T}} and ignoring the (causal) likelihoods of the actions a 1 , a 2 , … , a T {\displaystyle a_{1},a_{2},\ldots ,a_{T}} , and then by sampling the action a T + 1 ∗ {\displaystyle a_{T+1}^{\ast }} from the action distribution P ( a T + 1 | θ ∗ , a ^ 1 : T , o 1 : T ) {\displaystyle P(a_{T+1}|\theta ^{\ast },{\hat {a}}_{1:T},o_{1:T})} . === Upper-confidence-bound (UCB) algorithms === Thompson sampling and upper-confidence bound algorithms share a fundamental property that underlies many of their theoretical guarantees. Roughly speaking, both algorithms allocate exploratory effort to actions that might be optimal and are in this sense "optimistic". Leveraging this property, one can translate regret bounds established for UCB algorithms to Bayesian regret bounds for Thompson sampling or unify regret analysis across both these algorithms and many classes of problems.
AI art
Artificial intelligence visual art, or AI art, is visual artwork generated or enhanced through the implementation of artificial intelligence (AI) programs, most commonly using text-to-image models. The process of automated art-making has existed since antiquity. The field of artificial intelligence was founded in the 1950s, and artists began to create art with artificial intelligence shortly after the discipline's founding. A select number of these creations have been showcased in museums and have been recognized with awards. Throughout its history, AI has raised many philosophical questions related to the human mind, artificial beings, and the nature of art in human–AI collaboration. During the AI boom of the 2020s, text-to-image models such as Midjourney, DALL-E and Stable Diffusion became widely available to the public, allowing users to quickly generate imagery with little effort. Commentary about AI art in the 2020s has often focused on issues related to copyright, deception, defamation, and its impact on more traditional artists, including technological unemployment. In August 2023, the US Supreme Court ruled that AI art is ineligible for copyright due to failure to meet human authorship. In March 2026, it declined to hear a case over whether AI-generated art can be subject to copyright. == History == === Early history === Automated art dates back at least to the automata of ancient Greek civilization, when inventors such as Daedalus and Hero of Alexandria were described as designing machines capable of writing text, generating sounds, and playing music. Creative automatons have flourished throughout history, such as Maillardet's automaton, created around 1800 and capable of creating multiple drawings and poems. Also in the 19th century, Ada Lovelace, wrote that "computing operations" could potentially be used to generate music and poems. In 1950, Alan Turing's paper "Computing Machinery and Intelligence" focused on whether machines can mimic human behavior convincingly. Shortly after, the academic discipline of artificial intelligence was founded at a research workshop at Dartmouth College in 1956. Since its founding, AI researchers have explored philosophical questions about the nature of the human mind and the consequences of creating artificial beings with human-like intelligence; these issues have previously been explored by myth, fiction, and philosophy since antiquity. === Artistic history === Since the founding of AI in the 1950s, artists have used artificial intelligence to create artistic works. These works were sometimes referred to as algorithmic art, computer art, digital art, or new media art. One of the first significant AI art systems is AARON, developed by Harold Cohen beginning in the late 1960s at the University of California at San Diego. AARON uses a symbolic rule-based approach to generate technical images in the era of GOFAI programming, and it was developed by Cohen with the goal of being able to code the act of drawing. AARON was exhibited in 1972 at the Los Angeles County Museum of Art. From 1973 to 1975, Cohen refined AARON during a residency at the Artificial Intelligence Laboratory at Stanford University. In 2024, the Whitney Museum of American Art exhibited AI art from throughout Cohen's career, including re-created versions of his early robotic drawing machines. Karl Sims has exhibited art created with artificial life since the 1980s. He received an M.S. in computer graphics from the MIT Media Lab in 1987 and was artist-in-residence from 1990 to 1996 at the supercomputer manufacturer and artificial intelligence company Thinking Machines. In both 1991 and 1992, Sims won the Golden Nica award at Prix Ars Electronica for his videos using artificial evolution. In 1997, Sims created the interactive artificial evolution installation Galápagos for the NTT InterCommunication Center in Tokyo. Sims received an Emmy Award in 2019 for outstanding achievement in engineering development. In 1999, Scott Draves and a team of several engineers created and released Electric Sheep as a free software screensaver. Electric Sheep is a volunteer computing project for animating and evolving fractal flames, which are distributed to networked computers that display them as a screensaver. The screensaver used AI to create an infinite animation by learning from its audience. In 2001, Draves won the Fundacion Telefónica Life 4.0 prize for Electric Sheep. In 2014, Stephanie Dinkins began working on Conversations with Bina48. For the series, Dinkins recorded her conversations with BINA48, a social robot that resembles a middle-aged black woman. In 2019, Dinkins won the Creative Capital award for her creation of an evolving artificial intelligence based on the "interests and culture(s) of people of color." In 2015, Sougwen Chung began Mimicry (Drawing Operations Unit: Generation 1), an ongoing collaboration between the artist and a robotic arm. In 2019, Chung won the Lumen Prize for her continued performances with a robotic arm that uses AI to attempt to draw in a manner similar to Chung. In 2018, an auction sale of artificial intelligence art was held at Christie's in New York where the AI artwork Edmond de Belamy sold for US$432,500, which was almost 45 times higher than its estimate of US$7,000–10,000. The artwork was created by Obvious, a Paris-based collective. In 2024, Japanese film generAIdoscope was released. The film was co-directed by Hirotaka Adachi, Takeshi Sone, and Hiroki Yamaguchi. All video, audio, and music in the film were created with artificial intelligence. In 2025, the Japanese anime television series Twins Hinahima was released. The anime was produced and animated with AI assistance during the process of cutting and conversion of photographs into anime illustrations and later retouched by art staff. Most of the remaining parts such as characters and logos were hand-drawn with various software. === Technical history === Deep learning, characterized by its multi-layer structure that attempts to mimic the human brain, first came about in the 2010s, causing a significant shift in the world of AI art. During the deep learning era, there are mainly these types of designs for generative art: autoregressive models, diffusion models, GANs, normalizing flows. In 2014, Ian Goodfellow and colleagues at Université de Montréal developed the generative adversarial network (GAN), a type of deep neural network capable of learning to mimic the statistical distribution of input data such as images. The GAN uses a "generator" to create new images and a "discriminator" to decide which created images are considered successful. Unlike previous algorithmic art that followed hand-coded rules, generative adversarial networks could learn a specific aesthetic by analyzing a dataset of example images. In 2015, a team at Google released DeepDream, a program that uses a convolutional neural network to find and enhance patterns in images via algorithmic pareidolia. The process creates deliberately over-processed images with a dream-like appearance reminiscent of a psychedelic experience. Later, in 2017, a conditional GAN learned to generate 1000 image classes of ImageNet, a large visual database designed for use in visual object recognition software research. By conditioning the GAN on both random noise and a specific class label, this approach enhanced the quality of image synthesis for class-conditional models. Autoregressive models were used for image generation, such as PixelRNN (2016), which autoregressively generates one pixel after another with a recurrent neural network. Immediately after the Transformer architecture was proposed in Attention Is All You Need (2018), it was used for autoregressive generation of images, but without text conditioning. The website Artbreeder, launched in 2018, uses the models StyleGAN and BigGAN to allow users to generate and modify images such as faces, landscapes, and paintings. In the 2020s, text-to-image models, which generate images based on prompts, became widely used, marking yet another shift in the creation of AI-generated artworks. In 2021, using the influential large language generative pre-trained transformer models that are used in GPT-2 and GPT-3, OpenAI released a series of images created with the text-to-image AI model DALL-E 1. It is an autoregressive generative model with essentially the same architecture as GPT-3. Along with this, later in 2021, EleutherAI released the open source VQGAN-CLIP based on OpenAI's CLIP model. Diffusion models, generative models used to create synthetic data based on existing data, were first proposed in 2015, but they only became better than GANs in early 2021. Latent diffusion model was published in December 2021 and became the basis for the later Stable Diffusion (August 2022), developed through a collaboration between Stability AI, CompVis Group at LMU Munich, and Runway. In 2022, Midjourney was released, followed by Google Brain's Imagen and Pa
Active learning (machine learning)
Active learning is a special case of machine learning in which a learning algorithm can interactively query a human user (or some other information source) to label new data points with the desired outputs. The human user must possess expertise in the problem domain, including the ability to consult authoritative sources when necessary. In statistics literature, it is sometimes also called optimal experimental design. The information source is also called teacher or oracle. There are situations in which unlabeled data is abundant but manual labeling is expensive. In such a scenario, learning algorithms can actively query the teacher for labels. Since the learner chooses the examples, the number of examples to learn a concept can often be much lower than the number required in normal supervised learning. However, there is a risk that the algorithm is overwhelmed by uninformative examples. Recent developments are dedicated to multi-label active learning, hybrid active learning and active learning in a single-pass (on-line) context, combining concepts from the field of machine learning (e.g. conflict and ignorance) with adaptive, incremental learning policies in the field of online machine learning. Using active learning allows for faster development of a machine learning algorithm, when comparative updates would require a quantum or super computer. Large-scale active learning projects may benefit from crowdsourcing frameworks such as Amazon Mechanical Turk that include many humans in the active learning loop. == Definitions == Let T be the total set of all data under consideration. For example, in a protein engineering problem, T would include all proteins that are known to have a certain interesting activity and all additional proteins that one might want to test for that activity. During each iteration, i, T is broken up into three subsets T K , i {\displaystyle \mathbf {T} _{K,i}} : Data points where the label is known. T U , i {\displaystyle \mathbf {T} _{U,i}} : Data points where the label is unknown. T C , i {\displaystyle \mathbf {T} _{C,i}} : A subset of TU,i that is chosen to be labeled. Most of the current research in active learning involves the best method to choose the data points for TC,i. == Scenarios == Pool-based sampling: In this approach, which is the most well known scenario, the learning algorithm attempts to evaluate the entire dataset before selecting data points (instances) for labeling. It is often initially trained on a fully labeled subset of the data using a machine-learning method such as logistic regression or SVM that yields class-membership probabilities for individual data instances. The candidate instances are those for which the prediction is most ambiguous. Instances are drawn from the entire data pool and assigned a confidence score, a measurement of how well the learner "understands" the data. The system then selects the instances for which it is the least confident and queries the teacher for the labels. The theoretical drawback of pool-based sampling is that it is memory-intensive and is therefore limited in its capacity to handle enormous datasets, but in practice, the rate-limiting factor is that the teacher is typically a (fatiguable) human expert who must be paid for their effort, rather than computer memory. Stream-based selective sampling: Here, each consecutive unlabeled instance is examined one at a time with the machine evaluating the informativeness of each item against its query parameters. The learner decides for itself whether to assign a label or query the teacher for each datapoint. As contrasted with Pool-based sampling, the obvious drawback of stream-based methods is that the learning algorithm does not have sufficient information, early in the process, to make a sound assign-label-vs ask-teacher decision, and it does not capitalize as efficiently on the presence of already labeled data. Therefore, the teacher is likely to spend more effort in supplying labels than with the pool-based approach. Membership query synthesis: This is where the learner generates synthetic data from an underlying natural distribution. For example, if the dataset are pictures of humans and animals, the learner could send a clipped image of a leg to the teacher and query if this appendage belongs to an animal or human. This is particularly useful if the dataset is small. The challenge here, as with all synthetic-data-generation efforts, is in ensuring that the synthetic data is consistent in terms of meeting the constraints on real data. As the number of variables/features in the input data increase, and strong dependencies between variables exist, it becomes increasingly difficult to generate synthetic data with sufficient fidelity. For example, to create a synthetic data set for human laboratory-test values, the sum of the various white blood cell (WBC) components in a white blood cell differential must equal 100, since the component numbers are really percentages. Similarly, the enzymes alanine transaminase (ALT) and aspartate transaminase (AST) measure liver function (though AST is also produced by other tissues, e.g., lung, pancreas) A synthetic data point with AST at the lower limit of normal range (8–33 units/L) with an ALT several times above normal range (4–35 units/L) in a simulated chronically ill patient would be physiologically impossible. == Query strategies == Algorithms for determining which data points should be labeled can be organized into a number of different categories, based upon their purpose: Balance exploration and exploitation: the choice of examples to label is seen as a dilemma between the exploration and the exploitation over the data space representation. This strategy manages this compromise by modelling the active learning problem as a contextual bandit problem. For example, Bouneffouf et al. propose a sequential algorithm named Active Thompson Sampling (ATS), which, in each round, assigns a sampling distribution on the pool, samples one point from this distribution, and queries the oracle for this sample point label. Expected model change: label those points that would most change the current model. Expected error reduction: label those points that would most reduce the model's generalization error. Exponentiated Gradient Exploration for Active Learning: In this paper, the author proposes a sequential algorithm named exponentiated gradient (EG)-active that can improve any active learning algorithm by an optimal random exploration. Uncertainty sampling: label those points for which the current model is least certain as to what the correct output should be. Query by committee: a variety of models are trained on the current labeled data, and vote on the output for unlabeled data; label those points for which the "committee" disagrees the most Querying from diverse subspaces or partitions: When the underlying model is a forest of trees, the leaf nodes might represent (overlapping) partitions of the original feature space. This offers the possibility of selecting instances from non-overlapping or minimally overlapping partitions for labeling. Variance reduction: label those points that would minimize output variance, which is one of the components of error. Conformal prediction: predicts that a new data point will have a label similar to old data points in some specified way and degree of the similarity within the old examples is used to estimate the confidence in the prediction. Mismatch-first farthest-traversal: The primary selection criterion is the prediction mismatch between the current model and nearest-neighbour prediction. It targets on wrongly predicted data points. The second selection criterion is the distance to previously selected data, the farthest first. It aims at optimizing the diversity of selected data. User-centered labeling strategies: Learning is accomplished by applying dimensionality reduction to graphs and figures like scatter plots. Then the user is asked to label the compiled data (categorical, numerical, relevance scores, relation between two instances). A wide variety of algorithms have been studied that fall into these categories. While the traditional AL strategies can achieve remarkable performance, it is often challenging to predict in advance which strategy is the most suitable in a particular situation. In recent years, meta-learning algorithms have been gaining in popularity. Some of them have been proposed to tackle the problem of learning AL strategies instead of relying on manually designed strategies. A benchmark which compares 'meta-learning approaches to active learning' to 'traditional heuristic-based Active Learning' may give intuitions if 'Learning active learning' is at the crossroads == Minimum marginal hyperplane == Some active learning algorithms are built upon support-vector machines (SVMs) and exploit the structure of the SVM to determine which data points to label. Such methods usually calculate the margin, W, of each u
Vagueness
In linguistics and philosophy, a vague predicate is one which gives rise to borderline cases. For example, the English adjective "tall" is vague since it is not clearly true or false for someone of middling height. By contrast, the word "prime" is not vague since every number is definitively either prime or not. Vagueness is commonly diagnosed by a predicate's ability to give rise to the sorites paradox. Vagueness is separate from ambiguity, in which an expression has multiple denotations. For instance the word "bank" is ambiguous since it can refer either to a river bank or to a financial institution, but there are no borderline cases between both interpretations. Vagueness is a major topic of research in philosophical logic, where it serves as a potential challenge to classical logic. Work in formal semantics has sought to provide a compositional semantics for vague expressions in natural language. Work in philosophy of language has addressed implications of vagueness for the theory of meaning, while metaphysicists have considered whether reality itself is vague. == Importance == The concept of vagueness has philosophical importance. Suppose one wants to come up with a definition of "right" in the moral sense. One wants a definition to cover actions that are clearly right and exclude actions that are clearly wrong, but what does one do with the borderline cases? Surely, there are such cases. Some philosophers say that one should try to come up with a definition that is itself unclear on just those cases. Others say that one has an interest in making his or her definitions more precise than ordinary language, or his or her ordinary concepts, themselves allow; they recommend one advances precising definitions. === In law === Vagueness is also a problem which arises in law, and in some cases, judges have to arbitrate regarding whether a borderline case does, or does not, satisfy a given vague concept. Examples include disability (how much loss of vision is required before one is legally blind?), human life (at what point from conception to birth is one a legal human being, protected for instance by laws against murder?), adulthood (most familiarly reflected in legal ages for driving, drinking, voting, consensual sex, etc.), race (how to classify someone of mixed racial heritage), etc. Even such apparently unambiguous concepts such as biological sex can be subject to vagueness problems, not just from transsexuals' gender transitions but also from certain genetic conditions which can give an individual mixed male and female biological traits (see intersex). In the common law system, vagueness is a possible legal defence against by-laws and other regulations. The legal principle is that delegated power cannot be used more broadly than the delegator intended. Therefore, a regulation may not be so vague as to regulate areas beyond what the law allows. Any such regulation would be "void for vagueness" and unenforceable. This principle is sometimes used to strike down municipal by-laws that forbid "explicit" or "objectionable" contents from being sold in a certain city; courts often find such expressions to be too vague, giving municipal inspectors discretion beyond what the law allows. In the US this is known as the vagueness doctrine and in Europe as the principle of legal certainty. === In science === Many scientific concepts are of necessity vague, for instance species in biology cannot be precisely defined, owing to unclear cases such as ring species. Nonetheless, the concept of species can be clearly applied in the vast majority of cases. As this example illustrates, to say that a definition is "vague" is not necessarily a criticism. Consider those animals in Alaska that are the result of breeding huskies and wolves: are they dogs? It is not clear: they are borderline cases of dogs. This means one's ordinary concept of doghood is not clear enough to let us rule conclusively in this case. == Approaches == The philosophical question of what the best theoretical treatment of vagueness is—which is closely related to the problem of the paradox of the heap, a.k.a. sorites paradox—has been the subject of much philosophical debate. === Fuzzy logic === One theoretical approach is that of fuzzy logic, developed by American mathematician Lotfi Zadeh. Fuzzy logic proposes a gradual transition between "perfect falsity", for example, the statement "Bill Clinton is bald", to "perfect truth", for, say, "Patrick Stewart is bald". In ordinary logics, there are only two truth-values: "true" and "false". The fuzzy perspective differs by introducing an infinite number of truth-values along a spectrum between perfect truth and perfect falsity. Perfect truth may be represented by "1", and perfect falsity by "0". Borderline cases are thought of as having a "truth-value" anywhere between 0 and 1 (for example, 0.6). Advocates of the fuzzy logic approach have included K. F. Machina (1976) and Dorothy Edgington (1993). === Supervaluationism === Another theoretical approach is known as "supervaluationism". This approach has been defended by Kit Fine and Rosanna Keefe. Fine argues that borderline applications of vague predicates are neither true nor false, but rather are instances of "truth value gaps". He defends an interesting and sophisticated system of vague semantics, based on the notion that a vague predicate might be "made precise" in many alternative ways. This system has the consequence that borderline cases of vague terms yield statements that are neither true, nor false. Given a supervaluationist semantics, one can define the predicate "supertrue" as meaning "true on all precisifications". This predicate will not change the semantics of atomic statements (e.g. "Frank is bald", where Frank is a borderline case of baldness), but does have consequences for logically complex statements. In particular, the tautologies of sentential logic, such as "Frank is bald or Frank is not bald", will turn out to be supertrue, since on any precisification of baldness, either "Frank is bald" or "Frank is not bald" will be true. Since the presence of borderline cases seems to threaten principles like this one (excluded middle), the fact that supervaluationism can "rescue" them is seen as a virtue. === Subvaluationism === Subvaluationism is the logical dual of supervaluationism, and has been defended by Dominic Hyde (2008) and Pablo Cobreros (2011). Whereas the supervaluationist characterises truth as 'supertruth', the subvaluationist characterises truth as 'subtruth', or "true on at least some precisifications". Subvaluationism proposes that borderline applications of vague terms are both true and false. It thus has "truth-value gluts". According to this theory, a vague statement is true if it is true on at least one precisification and false if it is false under at least one precisification. If a vague statement comes out true under one precisification and false under another, it is both true and false. Subvaluationism ultimately amounts to the claim that vagueness is a truly contradictory phenomenon. Of a borderline case of "bald man" it would be both true and false to say that he is bald, and both true and false to say that he is not bald. === Epistemicist view === A fourth approach, known as "the epistemicist view", has been defended by Timothy Williamson (1994), R. A. Sorensen (1988) and (2001), and Nicholas Rescher (2009). They maintain that vague predicates do, in fact, draw sharp boundaries, but that one cannot know where these boundaries lie. One's confusion about whether some vague word does or does not apply in a borderline case is due to one's ignorance. For example, in the epistemicist view, there is a fact of the matter, for every person, about whether that person is old or not old; some people are ignorant of this fact. === As a property of objects === One possibility is that one's words and concepts are perfectly precise, but that objects themselves are vague. Consider Peter Unger's example of a cloud (from his famous 1980 paper, "The Problem of the Many"): it is not clear where the boundary of a cloud lies; for any given bit of water vapor, one can ask whether it is part of the cloud or not, and for many such bits, one will not know how to answer. Hence, perhaps such a term as 'cloud' is not itself vague, but rather precisely denotes a vague object. This strategy has occasionally been poorly received; most notably, in Gareth Evans' short paper "Can There Be Vague Objects?" (1978), wherein an argument is examined which appears to show that vague identity-statements are impossible (i.e., result in logical incoherence). David Lewis explains that the reader is intended to conclude, with Evans, that—since there clearly are, in fact, meaningful vague identities—any purported proof to the contrary cannot be right; and as the proof relies upon the premise that vague terms precisely denote vague objects, but fails under the view that vague terms reflect a merel