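Research Institute for Interdisciplinary Sciences (RIIS) Sharing Session

Sunday, June 8, 2025

9:00 AM - 12:00 PM

Room 102, School of Information Management and Engineering

Shanghai University of Finance and Economics (west side of Teaching Building No. 3)

100 Wudong Road, Yangpu District, Shanghai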

Plenary Talk

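Talk ①

Name: 欧文帆

Advisor: 毕晟

Research area: Data-driven decision-making algorithms

Speaker bio: Combined master's-PhD student in Management Science and Engineering, 2021 cohort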

When Actions and Rewards Mismatch in Intermittency: Imitation and Event-Driven Temporal Difference

Abstract

Mismatches between actions and rewards often lead to poor performance in reinforcement learning. Motivated by the surprising finding that general deep reinforcement learning algorithms struggle in intermittent environments, we identify two key effects that cause mismatches during training: the overwhelming effect and the lasting effect. To mitigate the mismatches caused by both effects, we propose a novel algorithmic framework, EVent-driven temporal differencE with imitatioN Training (EVENT). EVENT integrates deep learning-based imitation learning with an event-driven temporal difference method that automatically adapts backup lengths in its updates. We investigate the notoriously difficult problem of inventory control for long-tail items, where demand is intermittent and optimal policies remain elusive. We introduce a myopic heuristic and embed it into EVENT to tackle the problem. Our algorithm design is heavily inspired by structural analysis and modeling from operations management, and the superior numerical results highlight the value of structural knowledge in motivating algorithm design.

Plenary Talk

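Talk ②

Name: 夏婧凡

Advisor: 邓琪

Research area: Data-driven decision-making algorithms

Speaker bio: Combined master's-PhD student in Management Science and Engineering, 2021 cohort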

Revisiting Randomized Smoothing: Nonsmooth Nonconvex Optimization Beyond Global Lipschitz Continuity

Abstract

Randomized smoothing is a widely adopted technique for optimizing nonsmooth objective functions. However, its efficiency analysis typically relies on global Lipschitz continuity, a condition rarely met in practical applications. To address this limitation, we introduce a new subgradient growth condition that naturally encompasses a wide range of locally Lipschitz functions, with the classical global Lipschitz function as a special case.

Under this milder condition, we prove that randomized smoothing yields a differentiable function that satisfies certain generalized smoothness properties. To optimize such functions, we propose novel randomized smoothing gradient algorithms that, with high probability, converge to $(\delta, \epsilon)$-Goldstein stationary points and achieve a sample complexity of $\tilde{\mathcal{O}}(d^{5/2}\delta^{-1}\epsilon^{-4})$. By incorporating variance reduction techniques, we further improve the sample complexity to $\tilde{\mathcal{O}}(d^{3/2}\delta^{-1}\epsilon^{-3})$, matching the optimal $\epsilon$-bound under the global Lipschitz assumption, up to a logarithmic factor. Experimental results validate the effectiveness of our proposed algorithms.
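For readers unfamiliar with randomized smoothing, the following is a minimal sketch of the textbook two-point gradient estimator for the smoothed surrogate $f_\delta(x) = \mathbb{E}_u[f(x + \delta u)]$, with $u$ uniform in the unit ball. It is not the algorithm proposed in the talk: the function name `smoothed_grad_estimate`, the uniform-ball smoothing, and the plain sample average are illustrative assumptions, and no variance reduction is included.

```python
import numpy as np

def smoothed_grad_estimate(f, x, delta, rng, num_samples=1):
    """Monte Carlo estimate of the gradient of f_delta at x.

    Hypothetical helper for illustration. Uses the standard identity
    grad f_delta(x) = (d / delta) * E_w[f(x + delta * w) * w],
    with w uniform on the unit sphere, in its symmetric two-point form.
    """
    d = x.shape[0]
    g = np.zeros(d)
    for _ in range(num_samples):
        w = rng.standard_normal(d)
        w /= np.linalg.norm(w)  # normalized Gaussian is uniform on the sphere
        diff = f(x + delta * w) - f(x - delta * w)
        g += (d / (2.0 * delta)) * diff * w
    return g / num_samples

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    f = lambda z: np.abs(z).sum()  # nonsmooth test function (assumption)
    x = np.array([1.0, -2.0, 0.5])
    print(smoothed_grad_estimate(f, x, delta=0.1, rng=rng, num_samples=1000))
```

Only function values of `f` are queried, which is why sample complexity in such methods is counted in function evaluations and depends on the dimension $d$.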

Plenary Talk

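Talk ③

Name: 陈志恒

Advisor: 毕晟

Research area: Data-driven operations management

Speaker bio: Direct-entry PhD student in Management Science and Engineering, 2023 cohort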

Balancing Revenue and Public Health: Promotion Design in Fast-Food Collect-to-Win Campaigns

Abstract

Collect-to-win games, often employed by fast-food chains, are short-term promotions designed to boost revenue and customer engagement. In these games, customers collect game pieces attached to the packaging of several promotional products to win prizes. Unlike traditional promotions such as price discounts, these games harness the “Goal Gradient Effect”, whereby the incentive to purchase increases as customers approach a reward. Although the promotion significantly boosts fast-food sales, it has also sparked considerable controversy due to its potential negative impact on public health. This paper addresses this issue by strategically designing the promotion to balance revenue and public health. Specifically, we select appropriate products to promote and determine an optimal promotion duration. We model customers' dynamic choice behavior using a Polya process and impose covering constraints to ensure that the product assortment includes a minimum number of low-calorie products. We show that a new class of assortments, called quasi-revenue-ordered assortments, each consisting of a revenue-ordered assortment plus several additional items, is optimal. Finally, we conduct a numerical experiment using real data from the Australian market in 2017. The results show that the average calorie content per product purchased can be significantly reduced with only a minor sacrifice in revenue.
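As background on the Polya process mentioned above, here is a minimal sketch of a generic Polya urn: each purchase picks a product with probability proportional to its current weight, and the chosen product's weight then increases, so past choices reinforce future ones. The abstract does not specify the model used in the talk; the function name, the initial weights, and the unit reinforcement are all illustrative assumptions.

```python
import random

def simulate_polya_choices(initial_weights, num_purchases,
                           reinforcement=1.0, seed=0):
    """Simulate sequential choices from a generic Polya urn.

    Hypothetical helper for illustration. Returns the number of times
    each product was chosen.
    """
    rng = random.Random(seed)
    weights = list(initial_weights)
    counts = [0] * len(weights)
    for _ in range(num_purchases):
        # Choose a product with probability proportional to its weight.
        i = rng.choices(range(len(weights)), weights=weights, k=1)[0]
        counts[i] += 1
        # Reinforce the chosen product for future purchases.
        weights[i] += reinforcement
    return counts

if __name__ == "__main__":
    # Three promoted products with equal initial attractiveness (assumption).
    print(simulate_polya_choices([1.0, 1.0, 1.0], num_purchases=20))
```

The reinforcement step loosely mirrors the intuition of the Goal Gradient Effect: the more pieces a customer has collected for a product, the more likely the next purchase goes to it.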

WeChat ID | SUFE_RIIS


RIIS

RESEARCH INSTITUTE for INTERDISCIPLINARY SCIENCES