Accounting for real world phenomena in machine learning and mechanism design
As data becomes more readily available, individuals and organisations are increasingly relying on automated systems to make decisions on their behalf. Both machine learning and mechanism design play key roles in the design of such systems. Machine learning is often deployed to learn complex decision rules that mimic or improve upon those adopted by humans. Meanwhile, mechanism design is often deployed to ensure that decision rules satisfy certain axiomatic properties of interest to the designer, such as fairness and incentive compatibility. Unfortunately, many real world settings fall outside the scope of traditional machine learning and mechanism design frameworks. This thesis investigates how approaches from mechanism design and machine learning can be rigorously extended and adapted for such settings to yield meaningful theoretical guarantees.In particular, we investigate three problem domains; 1) linear regression in the presence of strategic agents, 2) sequential resource deployment with reusable resources and 3) repeated matching with reusable resources. For the first problem domain, we provide a theoretical framework based on Stackelberg predictions games. When the incentives of agents can be captured by a square loss function, we provide a polynomial time algorithm minimising Stackelberg risk, a natural analog to risk in classical supervised learning. For the second problem domain, we introduce a new multi-armed bandit model, called the adversarial blocking bandit problem, which incorporates nonstationary reward sequences and resource unavailability. In particular, we provide finite-time regret guarantees for this setting, by benchmarking against an oracle algorithm which approximates the optimal arm pulling policy. Lastly, for the third problem domain, we introduce a new sequential matching setting, in which a central planner is tasked with constructing matchings repeatedly through time under the assumption that some goods or services may become temporarily unavailable once assigned. Motivated by the random serial dictatorship algorithm, we construct an algorithm for the setting which is approximately truthful and approximately maximises social welfare.
https://eprints.soton.ac.uk/474341/
https://eprints.soton.ac.uk/474341/1/Thesis_3au.pdf